ASPERGILLUS ARABINOFURANOSIDASE



The present invention relates to an enzyme. In addition, the present invention relates to 
a nucleotide sequence coding for the enzyme. Also, the present invention relates to a 
promoter, wherein the promoter can be used to control the expression of the nucleotide
sequence coding for the enzyme. 

In particular, the enzyme of the present invention is an arabinofuranosidase enzyme 
having arabinoxylan degrading activity. 

It is known that it is desirable to direct expression of a gene of interest ("GOI") in certain
tissues of an organism - such as a filamentous fungus (such as Aspergillus niger) or even
a plant crop. The resultant protein or enzyme may be useful for the organism itself. For
example, it may be desirable to produce crop protein products with an optimised amino
acid composition and so increase the nutritive value of a crop. For example, the crop
may be made more useful as a feed.

In the alternative, it may be desirable to isolate the resultant protein or enzyme and then 
use the protein or enzyme to prepare, for example, food compositions. In this regard, 
the resultant protein or enzyme can be a component of the food composition or it can be
used to prepare food compositions, including altering the characteristics or appearance 
of food compositions. It may even be desirable to use the organism, such as a 
filamentous fungus or a crop plant, to express non-plant genes, such as for the same 
purposes. 

Also, it may be desirable to use an organism, such as a filamentous fungus or a crop
plant, to express mammalian genes. Examples of the latter products include interferons,
insulin, blood factors and plasminogen activators. It is also desirable to use
micro-organisms, such as filamentous fungi, to prepare products from GOIs by use of promoters
that are active in the micro-organisms.

Fruit and vegetable cell walls largely consist of polysaccharide, the major components
being pectin, cellulose and xyloglucan (R.R. Selvendran and J.A. Robertson, IFR Report
1989). Numerous cell wall models have been proposed which attempt to incorporate the
essential properties of strength and flexibility (P. Albersheim, Sci. Am. 232, 81-95,
1975; P. Albersheim, Plant Biochem. 3rd Edition (Bonner and Varner), Ac. Press, 1976;
T. Hayashi, Ann. Rev. Plant Physiol. & Plant Mol. Biol., 40, 139-168, 1989).

The composition of the plant cell wall is complex and variable. Polysaccharides are
mainly found in the form of long chains of cellulose (the main structural component of
the plant cell wall), hemicellulose (comprising various β-xylan chains) and pectic
substances (consisting of galacturonans and rhamnogalacturonans; arabinans; and
galactans and arabinogalactans). From the standpoint of the food industry, the pectic
substances, arabinans in particular, have become one of the most important constituents
of plant cell walls (Whitaker, J.R. (1984) Enzyme Microb. Technol., 6, 341).

One form of plant polysaccharide is arabinan. A review of arabinans may be found in
EP-A-0506190. According to this document, arabinans consist of a main chain of
α-(1→5) groups linked to one another. Side chains are linked α-(1→3) or sometimes
α-(1→2) to the main α-(1→5)-L-arabinan backbone. In apple, for example, one third of the
total arabinose is present in the side chains. The molecular weight of arabinan is
normally about 15 kDa.

Arabinans are degraded by enzymes collectively called arabinases. In this regard,
arabinan-degrading activity is the ability of an enzyme to release arabinose residues,
either monomers or oligomers, from arabinan backbones or from arabinan-containing side
chains of other hemicellulose backbone structures such as arabinogalactans, or even the
release of arabinose monomers via the cleavage of the 1→6 linkage between the terminal
arabinofuranosyl unit and the intermediate glucosyl unit of monoterpenyl
α-L-arabinofuranosyl glucosides.

The activities of the arabinan degrading enzymes of EP-A-0506190 include: a) the ability
to cleave (1→2)-α-L-arabinosidic linkages; b) the ability to cleave (1→3)-α-L-arabinosidic
linkages; c) the ability to cleave (1→5)-α-L-arabinosidic linkages; d) the ability to cleave
the 1→6 linkage between the terminal arabinofuranosyl unit and the intermediate glucosyl
unit of monoterpenyl α-L-arabinofuranosyl glucosides.

Arabinan-degrading enzymes are known to be produced by a variety of plants and
microorganisms, among these, fungi such as those of the genera Aspergillus, Corticium,
Rhodotorula (Kaji, A. (1984) Adv. Carbohydr. Chem. Biochem., 42, 383), Dichotomitus
(Brillouet et al. (1985) Carbohydrate Research, 144, 113), Ascomycetes and
Basidiomycetes (Sydow, G. (1977) DDR Patent Application No. 124,812).

Another plant polysaccharide is xylan, whose major monosaccharide unit is xylose. 
Xylans are abundant components of the hemicelluloses. In monocotyledonous plants the 
dominant hemicellulose is an arabinoxylan, in which arabinose side chains are attached
to a backbone of xylose residues. 

Arabinoxylans are carbohydrates found in the cell wall of cereals. A review of 
arabinoxylans and the enzymatic degradation thereof may be found in Voragen et al.
(1992, Characterisation of Cereal Arabinoxylans, Xylans and Xylanases, pages 51-67,
edited by J. Visser published by Elsevier Science Publishers). 

Typically, arabinoxylans comprise a xylose backbone linked together via β-1,4 bonds.
The xylose backbone is substituted with L-arabinose residues which are linked via α-1
bonds to the 2 or 3 position of the xylose residues. The xylose residues can be singly or
doubly substituted. In addition to substitution with arabinose the xylose residues can be
substituted with acetyl groups, glucuronic acid and various other carbohydrates. The
arabinose residues can be further substituted with phenolic acids such as ferulic acid and
coumaric acid. The degree and kind of substitution depends on the source of the
particular arabinoxylan.

Arabinoxylans are found in cereal cell walls where they are part of the secondary cell
wall. Arabinoxylans form about 3 % of wheat flour - part of it is water soluble (WSP),
part of it is water insoluble (WIP).

Despite the fact that the arabinoxylans amount to only about 3 % of wheat the importance
of the arabinoxylan fraction is much higher. This is because the arabinoxylans of cereals
act as hydrocolloids, as they form a gel-like structure with water. For example, the
arabinoxylans of wheat flour bind up to 30% of the water in a dough despite the fact that
they amount to only 3 % of the dry matter. When arabinoxylans bind water they
increase the viscosity of the ground cereals and to such an extent that the cereals can
become difficult to manage.

The rheological properties of several systems where ground cereals are used can be
manipulated using enzymes that degrade arabinoxylans. In modern bakery it is
advantageous to reduce the viscosity of the dough in order to reduce the energy needed
to process the doughs and also to get a higher volume of the bread. This is usually
achieved by using enzymes that can degrade the xylose backbone of arabinoxylans.

Enzymes that only cleave the arabinose side chains from the xylan backbone of 
arabinoxylan are, for the purposes of this application, collectively called arabinoxylan
degrading enzymes. 

In feeds based on cereals, arabinoxylans in the cereals can increase the viscosity of the
fluids in the intestines of the animals after the feeds have been ingested. This is a
problem as it causes discomfort, such as indigestion, to the animals. Also, the nutritive
value of the feeds is reduced. These problems can be avoided by addition of enzymes
that degrade the arabinoxylan (such as xylanases) to the feed to avoid indigestion and to
increase the nutritive value of the feed. However, some enzymes that degrade the
arabinoxylans (especially some of the xylanases) require the presence of unsubstituted
backbones and so their activity can be limited.

Further discussions on arabinoxylans can be found in Xylans and Xylanases (1992, edited
by J. Visser published by Elsevier Science Publishers).

An arabinoxylan degrading enzyme is (1,4)-β-D-arabinoxylan arabinofuranohydrolase
(AXH), as described by Kormelink et al. 1991 (Kormelink, F.J.M., Searle-Van Leeuwen,
M.J.F., Wood, T.M., Voragen, A.G.J. (1991) Purification and characterization of a
(1,4)-β-D-arabinoxylan arabinofuranohydrolase from Aspergillus awamori. Appl.
Microbiol. Biotechnol. 35:753-758). However, this document provides no sequence data for
the enzyme or the nucleotide sequence coding for same or for the promoter for the same.

Clearly, it would be useful to be able to degrade arabinoxylans, preferably by use of 
recombinant DNA techniques. 

The present invention seeks to provide an enzyme having arabinoxylan degrading activity; 
preferably wherein the enzyme can be prepared in certain or specific cells or tissues, such
as in just a specific cell or tissue, of an organism, typically a filamentous fungus, 
preferably of the genus Aspergillus, such as Aspergillus niger, or even a plant. 

Also, the present invention seeks to provide a GOI coding for the enzyme that can be 
expressed preferably in specific cells or tissues, such as in certain or specific cells or
tissues, of an organism, typically a filamentous fungus, preferably of the genus 
Aspergillus, such as Aspergillus niger, or even a plant. 

In addition, the present invention seeks to provide a promoter that is capable of directing
expression of a GOI, such as a nucleotide sequence coding for the enzyme according to
the present invention, preferably in certain specific cells or tissues, such as in just a
specific cell or tissue, of an organism, typically a filamentous fungus, preferably of the
genus Aspergillus, such as Aspergillus niger, or even a plant. Preferably, the promoter
is used in Aspergillus wherein the product encoded by the GOI is excreted from the host
organism into the surrounding medium.

Furthermore, the present invention seeks to provide constructs, vectors, plasmids, cells,
tissues, organs and organisms comprising the GOI and/or the promoter, and methods of
expressing the same, preferably in specific cells or tissues, such as expression in just a
specific cell or tissue, of an organism, typically a filamentous fungus, preferably of the
genus Aspergillus, or even a plant.

According to a first aspect of the present invention there is provided an enzyme
obtainable from Aspergillus, wherein the enzyme has the following characteristics: a MW
of 33,270 D ± 50 D; a pI value of about 3.7; arabinoxylan degrading activity; a pH
optimum of from about 2.5 to about 7.0 (more especially from about 3.3 to about 4.6,
more especially about 4); a temperature optimum of from about 40°C to about 60°C (more
especially from about 45°C to about 55°C, more especially about 50°C); and wherein the
enzyme is capable of cleaving arabinose from the xylose backbone of an arabinoxylan.
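
By way of illustration only, the numerical characteristics recited above can be written as a simple range check. The following is a minimal sketch: the names EnzymeProfile and within_quoted_ranges are hypothetical, the word "about" is treated as an exact bound purely for convenience, and the pI value and the activity requirements are not tested because the text gives no numerical tolerance for them.

```python
from dataclasses import dataclass

# Values quoted in the text. Treating "about" as a hard bound is an
# assumption made only for this illustration.
MW_TARGET_DA = 33270.0
MW_TOLERANCE_DA = 50.0
PH_OPTIMUM_RANGE = (2.5, 7.0)
TEMP_OPTIMUM_RANGE_C = (40.0, 60.0)


@dataclass
class EnzymeProfile:
    """Hypothetical container for measured properties of a candidate enzyme."""
    mw_da: float           # molecular weight in daltons
    ph_optimum: float      # measured pH optimum
    temp_optimum_c: float  # measured temperature optimum in degrees Celsius


def within_quoted_ranges(profile: EnzymeProfile) -> bool:
    """Return True if MW, pH optimum and temperature optimum fall in the broadest quoted ranges."""
    mw_ok = abs(profile.mw_da - MW_TARGET_DA) <= MW_TOLERANCE_DA
    ph_ok = PH_OPTIMUM_RANGE[0] <= profile.ph_optimum <= PH_OPTIMUM_RANGE[1]
    temp_ok = TEMP_OPTIMUM_RANGE_C[0] <= profile.temp_optimum_c <= TEMP_OPTIMUM_RANGE_C[1]
    return mw_ok and ph_ok and temp_ok
```

Such a check covers only the physical parameters; whether a candidate enzyme cleaves arabinose from the xylose backbone of an arabinoxylan has to be established by assay.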

According to a second aspect of the present invention there is provided an enzyme having
the sequence shown as SEQ. I.D. No. 1 or a variant, homologue or fragment thereof. 

According to a third aspect of the present invention there is provided an enzyme coded 
by the nucleotide sequence shown as SEQ. I.D. No. 2 or a variant, homologue or 
fragment thereof or a sequence complementary thereto.

According to a fourth aspect of the present invention there is provided a nucleotide 
sequence coding for the enzyme according to the present invention. 

According to a fifth aspect of the present invention there is provided a nucleotide
sequence having the sequence shown as SEQ. I.D. No. 2 or a variant, homologue or 
fragment thereof or a sequence complementary thereto. 

According to a sixth aspect of the present invention there is provided a promoter having
the sequence shown as SEQ. I.D. No. 3 or a variant, homologue or fragment thereof or
a sequence complementary thereto.

According to a seventh aspect of the present invention there is provided a terminator
having the nucleotide sequence shown as SEQ. I.D. No. 13 or a variant, homologue or
fragment thereof or a sequence complementary thereto.

According to an eighth aspect of the present invention there is provided a signal sequence
having the nucleotide sequence shown as SEQ. I.D. No. 14 or a variant, homologue or
fragment thereof or a sequence complementary thereto. 

According to a ninth aspect of the present invention there is provided a process for 
expressing a GOI by use of a promoter, wherein the promoter is the promoter according
to the present invention. 

According to a tenth aspect of the present invention there is provided the use of an 
enzyme according to the present invention to degrade an arabinoxylan. 

According to an eleventh aspect of the present invention there is provided a combination 
of enzymes to degrade an arabinoxylan, the combination comprising an enzyme according 
to the present invention and a xylanase. 

According to a twelfth aspect of the present invention there is provided plasmid NCIMB
40703, or a nucleotide sequence obtainable therefrom for expressing an enzyme capable 
of degrading arabinoxylan or for controlling the expression thereof or for controlling the 
expression of another GOI. 

According to a thirteenth aspect of the present invention there is provided a signal
sequence having the sequence shown as SEQ. I.D. No. 15 or a variant, homologue or
fragment thereof. 

According to a fourteenth aspect of the present invention there is provided the use of the
enzyme according to the present invention in the manufacture of a medicament or
foodstuff to reduce or prevent indigestion and/or increase digestibility and/or increase
nutrient absorption.

According to a fifteenth aspect of the present invention there is provided an
arabinofuranosidase enzyme having arabinoxylan degrading activity, which is
immunologically reactive with an antibody raised against a purified arabinofuranosidase
enzyme having the sequence shown as SEQ. I.D. No. 1.

According to a sixteenth aspect of the present invention there is provided an 
arabinofuranosidase promoter wherein the promoter is inducible by an intermediate in 
xylose metabolism. 

According to a seventeenth aspect of the present invention there is provided a process of
reducing the viscosity of a branched substrate wherein the enzyme degrades the branches 
of the substrate but not the backbone of the substrate. 

According to a further aspect of the present invention there is provided the use of the 
enzyme of the present invention as a viscosity modifier.

According to a further aspect of the present invention there is provided the use of the 
enzyme of the present invention to reduce the viscosity of pectin. 

Other aspects of the present invention include constructs, vectors, plasmids, cells, tissues,
organs and transgenic organisms comprising the aforementioned aspects of the present 
invention. 

Other aspects of the present invention include methods of expressing or allowing 
expression or transforming any one of the nucleotide sequence, the construct, the
plasmid, the vector, the cell, the tissue, the organ or the organism, as well as the 
products thereof. 

Additional aspects of the present invention include uses of the promoter for expressing
GOIs in culture media such as a broth or in a transgenic organism.

Further aspects of the present invention include uses of the enzyme for preparing or
treating foodstuffs, including animal feed.

Preferably the enzyme is coded by the nucleotide sequence shown as SEQ. I.D. No. 2
or a variant, homologue or fragment thereof or a sequence complementary thereto.

Preferably the nucleotide sequence has the sequence shown as SEQ. I.D. No. 2 or a
variant, homologue or fragment thereof or a sequence complementary thereto. 

Preferably the nucleotide sequence is operatively linked to a promoter.

Preferably the promoter comprises the sequence CCAAT. 
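
As an illustration of this preference, a candidate promoter sequence can be scanned for the CCAAT motif. The following is a minimal sketch: the function name find_ccaat is hypothetical, and only the strand supplied is searched, so a motif present on the complementary strand would not be reported.

```python
from typing import List


def find_ccaat(sequence: str) -> List[int]:
    """Return the 0-based start positions of every CCAAT occurrence on the given strand."""
    motif = "CCAAT"
    seq = sequence.upper()  # accept lower-case input
    positions = []
    start = seq.find(motif)
    while start != -1:
        positions.append(start)
        # Continue one base further on so later occurrences are also found.
        start = seq.find(motif, start + 1)
    return positions
```

An empty list indicates that the motif is absent from the strand examined.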

Preferably the promoter is the promoter having the sequence shown as SEQ. I.D. No.
3 or a variant, homologue or fragment thereof or a sequence complementary thereto.

Preferably, the promoter comprises the 100 bps sequence from the XmaIII to the
BamHI sites.

Preferably the promoter of the present invention is operatively linked to a GOI.

Preferably the GOI comprises a nucleotide sequence according to the present invention.

Preferably the transgenic organism is a fungus.

Preferably the transgenic organism is a filamentous fungus, more preferably of the genus 
Aspergillus. 

Preferably the transgenic organism is a plant. 

Preferably, in the use, the enzyme is used in combination with a xylanase, preferably an
endoxylanase.

Highly preferred embodiments of each of the aspects of the present invention do not
include any one of the native enzyme, the native promoter or the native nucleotide
sequence in its natural environment.

Preferably, in any one of the plasmid, the vector such as an expression vector or a
transformation vector, the cell, the tissue, the organ, the organism or the transgenic 
organism, the promoter is present in combination with at least one GOI. 

Preferably the promoter and the GOI are stably incorporated within the transgenic 
organism's genome.

Preferably the transgenic organism is a filamentous fungus, preferably of the genus 
Aspergillus, more preferably Aspergillus niger. The transgenic organism can even be a 
plant, such as a monocot or dicot plant. 

A highly preferred embodiment is an enzyme obtainable from Aspergillus, wherein the
enzyme has the following characteristics: a MW of 33,270 D ± 50 D; a pI value of
about 3.7; arabinoxylan degrading activity; a pH optimum of from about 2.5 to about 7.0
(more especially from about 3.3 to about 4.6, more especially about 4); a temperature
optimum of from about 40°C to about 60°C (more especially from about 45°C to about
55°C, more especially about 50°C); and wherein the enzyme is capable of cleaving
arabinose from the xylose backbone of an arabinoxylan; wherein the enzyme has the
sequence shown as SEQ. I.D. No. 1 or a variant, homologue or fragment thereof.

Another highly preferred embodiment is an enzyme obtainable from Aspergillus, wherein
the enzyme has the following characteristics: a MW of 33,270 D ± 50 D; a pI value of
about 3.7; arabinoxylan degrading activity; a pH optimum of from about 2.5 to about 7.0
(more especially from about 3.3 to about 4.6, more especially about 4); a temperature
optimum of from about 40°C to about 60°C (more especially from about 45°C to about
55°C, more especially about 50°C); and wherein the enzyme is capable of cleaving
arabinose from the xylose backbone of an arabinoxylan; wherein the enzyme is coded by
the nucleotide sequence shown as SEQ. I.D. No. 2 or a variant, homologue or fragment
thereof or a sequence complementary thereto.

The advantages of the present invention are that it provides a means for preparing an 
arabinofuranosidase enzyme having arabinoxylan degrading activity and the nucleotide 
sequence coding for the same. In addition, it provides a promoter that can control the
expression of that, or another, nucleotide sequence. 

Other advantages are that the enzyme of the present invention can affect the viscosity of 
ground cereals, such as dough, to ease the handling thereof and for example to get a 
higher volume of the bread.

The enzyme of the present invention is also advantageous for feed because it degrades 
arabinoxylan and thus increases the nutritive value of the feed. In addition, it reduces 
the viscosity of the arabinoxylan in the intestine of the animals and so reduces or prevents 
indigestion.

The combination of the use of the enzyme of the present invention with a xylanase is 
particularly advantageous because the enzyme of the present invention and the xylanase 
have a surprising and unexpected synergistic effect with each other. 

In this regard, the enzyme of the present invention increases the degradative effect of the 
xylanase, and the xylanase increases the degradative effect of the enzyme of the present 
invention. It is believed that the activity of the xylanase is increased because the enzyme 
of the present invention provides a polysaccharide substrate having fewer substituted 
groups.

The present invention therefore provides an enzyme having arabinoxylan degrading
activity wherein the enzyme can be prepared in certain or specific cells or tissues, such
as in just a specific cell or tissue, of an organism, typically a filamentous fungus,
preferably of the genus Aspergillus, such as Aspergillus niger. The enzyme may even be
prepared in a plant.

More particularly, the enzyme of the present invention is capable of specifically cleaving
arabinose from the xylose backbone of arabinoxylan.

The arabinofuranosidase of the present invention is different from the
arabinofuranosidases previously known. In this regard, the previously described
arabinofuranosidases - such as those of EP-A-0506190 - are characterised by their ability
to degrade unbranched arabinan, and are assayed using p-nitrophenyl-arabinoside.

The arabinofuranosidase of the present invention does not degrade unbranched arabinan,
and only a minor activity is seen on nitrophenyl-arabinoside. In contrast, the
arabinofuranosidase of the present invention is useful for degrading arabinoxylan.
Therefore, the arabinofuranosidase of the present invention is quite different from the
previously isolated arabinofuranosidases.

Also, the present invention provides a GOI coding for the enzyme that can be expressed
preferably in specific cells or tissues, such as in certain or specific cells or tissues, of an 
organism, typically a filamentous fungus, preferably of the genus Aspergillus, such as 
Aspergillus niger. The GOI may even be expressed in a plant. 

In addition, the present invention provides a promoter that is capable of directing
expression of a GOI, such as a nucleotide sequence coding for the enzyme according to
the present invention, preferably in certain specific cells or tissues, such as in just a
specific cell or tissue, of an organism, typically a filamentous fungus, preferably of the
genus Aspergillus, such as Aspergillus niger, or even a plant. Preferably, the promoter
is used in Aspergillus wherein the product encoded by the GOI is excreted from the host
organism into the surrounding medium. The promoter may even be tailored (if
necessary) to express a GOI in a plant.

The present invention also provides constructs, vectors, plasmids, cells, tissues, organs
and organisms comprising the GOI and/or the promoter, and methods of expressing the
same, preferably in specific cells or tissues, such as expression in just a specific cell or
tissue, of an organism, typically a filamentous fungus, preferably of the genus

The terms "variant", "homologue" or "fragment" in relation to the enzyme include any
substitution of, variation of, modification of, replacement of, deletion of or addition of
one (or more) amino acid from or to the sequence providing the resultant amino acid
sequence has arabinoxylan degrading activity, preferably having at least the same activity
of the enzyme shown in the sequence listings (SEQ. I.D. No. 1 or 12). In particular, the
term "homologue" covers homology with respect to structure and/or function providing
the resultant enzyme has arabinoxylan degrading activity. With respect to sequence
homology, preferably there is at least 75%, more preferably at least 85%, more
preferably at least 90% homology to SEQ ID NO. 1 shown in the attached sequence
listings. More preferably there is at least 95%, more preferably at least 98%, homology
to SEQ ID NO. 1 shown in the attached sequence listings.

The terms "variant", "homologue" or "fragment" in relation to the nucleotide sequence 
coding for the enzyme include any substitution of, variation of, modification of, 
replacement of, deletion of or addition of one (or more) nucleic acid from or to the 
sequence providing the resultant nucleotide sequence codes for an enzyme having 
arabinoxylan degrading activity, preferably having at least the same activity of the 
enzyme shown in the sequence listings (SEQ. I.D. No. 2 or 12). In particular, the term
"homologue" covers homology with respect to structure and/or function providing the 
resultant nucleotide sequence codes for an enzyme having arabinoxylan degrading 
activity. With respect to sequence homology, preferably there is at least 75%, more 
preferably at least 85%, more preferably at least 90% homology to SEQ ID NO. 2 shown 
in the attached sequence listings. More preferably there is at least 95%, more preferably 
at least 98%, homology to SEQ ID NO. 2 shown in the attached sequence listings.
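
The graded homology preferences recited above (75%, 85%, 90%, 95% and 98%) can be illustrated with a short calculation. The following is a minimal sketch under stated assumptions: the text does not define how "homology" is to be measured, so simple percent identity over two already-aligned sequences of equal length (with "-" marking gaps) is assumed here, and the names percent_identity and highest_preference_met are hypothetical.

```python
from typing import Optional

# Thresholds quoted in the text, from most to least preferred.
PREFERENCE_THRESHOLDS = (98.0, 95.0, 90.0, 85.0, 75.0)


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned columns that are identical; columns gapped in both rows are ignored."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" and y == "-":
            continue  # column carries no information
        compared += 1
        if x == y:
            matches += 1  # a gap opposite a residue never matches
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


def highest_preference_met(identity_pct: float) -> Optional[float]:
    """Return the strictest quoted threshold that the value satisfies, or None if below 75%."""
    for threshold in PREFERENCE_THRESHOLDS:
        if identity_pct >= threshold:
            return threshold
    return None
```

The same calculation applies to amino acid and nucleotide sequences alike. A sequence meeting a threshold on this measure would still need to retain the recited activity to fall within the definitions above.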

The terms "variant", "homologue" or "fragment" in relation to the promoter include any
substitution of, variation of, modification of, replacement of, deletion of or addition of
one (or more) nucleic acid from or to the sequence providing the resultant nucleotide
sequence has the ability to act as a promoter in an expression system - such as the
transformed cell or the transgenic organism according to the present invention. In
particular, the term "homologue" covers homology with respect to structure and/or
function providing the resultant nucleotide sequence has the ability to act as a promoter.
With respect to sequence homology, preferably there is at least 75%, more preferably at
least 85%, more preferably at least 90% homology to SEQ ID NO. 3 shown in the
attached sequence listings. More preferably there is at least 95%, more preferably at
least 98%, homology to SEQ ID NO. 3 shown in the attached sequence listings.

The terms "variant", "homologue" or "fragment" in relation to the terminator or signal
nucleotide sequences include any substitution of, variation of, modification of,
replacement of, deletion of or addition of one (or more) nucleic acid from or to the
sequence providing the resultant nucleotide sequence has the ability to act as a terminator
or codes for an amino acid sequence that has the ability to act as a signal sequence
respectively in an expression system - such as the transformed cell or the transgenic
organism according to the present invention. In particular, the term "homologue" covers
homology with respect to structure and/or function providing the resultant nucleotide
sequence has the ability to act as or code for a terminator or signal respectively. With
respect to sequence homology, preferably there is at least 75%, more preferably at least
85%, more preferably at least 90% homology to SEQ ID NO.s 13 and 14 (respectively)
shown in the attached sequence listings. More preferably there is at least 95%, more
preferably at least 98%, homology to SEQ ID NO.s 13 and 14 (respectively) shown in
the attached sequence listings.

The terms "variant", "homologue" or "fragment" in relation to the signal amino acid
sequence include any substitution of, variation of, modification of, replacement of,
deletion of or addition of one (or more) amino acid from or to the sequence providing
the resultant sequence has the ability to act as a signal sequence in an expression system -
such as the transformed cell or the transgenic organism according to the present
invention. In particular, the term "homologue" covers homology with respect to structure
and/or function providing the resultant nucleotide sequence has the ability to act as or
code for a signal respectively. With respect to sequence homology, preferably there is
at least 75%, more preferably at least 85%, more preferably at least 90% homology to
SEQ ID NO 15 shown in the attached sequence listings. More preferably there is at least
95%, more preferably at least 98%, homology to SEQ ID NO 15 shown in the attached
sequence listings.

The above terms are synonymous with allelic variations of the sequences. 

The term "complementary" means that the present invention also covers nucleotide 
sequences that can hybridise to the nucleotide sequences of the coding sequence or the 
promoter sequence, respectively. 
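
For illustration, the simplest case of a complementary sequence is the exact reverse complement of a DNA strand. The following is a minimal sketch; the function name reverse_complement is hypothetical, only the four unambiguous DNA bases are handled, and the definition above is broader in that it covers any sequence able to hybridise, not only a perfect complement.

```python
# Translation table for the four unambiguous DNA bases, in both cases.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Return the exact reverse complement of a DNA strand written 5' to 3'."""
    # Complement each base, then reverse so the result also reads 5' to 3'.
    return dna.translate(_COMPLEMENT)[::-1]
```

Applying the function twice returns the original strand, which is a convenient sanity check.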

The term "nucleotide" in relation to the present invention includes genomic DNA, cDNA,
synthetic DNA, and RNA. Preferably it means DNA, more preferably cDNA for the 
coding sequence of the present invention. 

The term "construct" - which is synonymous with terms such as "conjugate", "cassette"
and "hybrid" - includes a GOI directly or indirectly attached to a promoter. An example
of an indirect attachment is the provision of a suitable spacer group such as an intron
sequence, such as the Sh1-intron or the ADH intron, intermediate the promoter and the
GOI. The same is true for the term "fused" in relation to the present invention which
includes direct or indirect attachment. In each case, it is highly preferred that the terms
do not cover the natural combination of the gene coding for the enzyme ordinarily
associated with the wild type gene promoter and when they are both in their natural
environment. A highly preferred embodiment is the or a GOI being operatively linked
to a or the promoter.

The construct may even contain or express a marker which allows for the selection of the
genetic construct in, for example, a filamentous fungus, preferably of the genus
Aspergillus, such as Aspergillus niger, or plants, preferably cereals, such as maize, rice,
barley etc., into which it has been transferred. Various markers exist which may be
used, such as for example those encoding mannose-6-phosphate isomerase (especially for
plants) or those markers that provide for antibiotic resistance - e.g. resistance to G418,
hygromycin, bleomycin, kanamycin and gentamycin.

The term "vector" includes expression vectors and transformation vectors.



The term "expression vector" means a construct capable of in vivo or in vitro expression. 

The term "transformation vector" means a construct capable of being transferred from
one species to another - such as from an E. coli plasmid to a filamentous fungus,
preferably of the genus Aspergillus. It may even be a construct capable of being
transferred from an E. coli plasmid to an Agrobacterium to a plant.

The term "tissue" includes tissue per se and organ.

The term "organism" in relation to the present invention includes any organism that could 
comprise the promoter according to the present invention and/or the nucleotide sequence 
coding for the enzyme according to the present invention and/or products obtained 
therefrom, wherein the promoter can allow expression of a GOI and/or wherein the
nucleotide sequence according to the present invention can be expressed when present in 
the organism. 

Preferably the organism is a filamentous fungus, preferably of the genus Aspergillus, 
20 more preferably Aspergillus niger. 

The term "transgenic organism" in relation to the present invention includes any organism
that comprises the promoter according to the present invention and/or the nucleotide
sequence coding for the enzyme according to the present invention and/or products
obtained therefrom, wherein the promoter can allow expression of a GOI and/or wherein
the nucleotide sequence according to the present invention can be expressed within the
organism. Preferably the promoter and/or the nucleotide sequence is (are) incorporated
in the genome of the organism.

Preferably the transgenic organism is a filamentous fungus, preferably of the genus
Aspergillus, more preferably Aspergillus niger.

Therefore, the transgenic organism of the present invention includes an organism
comprising any one of, or combinations of, the promoter according to the present
invention, the nucleotide sequence coding for the enzyme according to the present
invention, constructs according to the present invention, vectors according to the present
invention, plasmids according to the present invention, cells according to the present
invention, tissues according to the present invention or the products thereof. For example
the transgenic organism can comprise a GOI, preferably an exogenous nucleotide
sequence, under the control of the promoter according to the present invention. The
transgenic organism can also comprise the nucleotide sequence coding for the enzyme of
the present invention under the control of a promoter, which may be the promoter
according to the present invention.

In a highly preferred embodiment, the transgenic organism does not comprise the
combination of the promoter according to the present invention and the nucleotide
sequence coding for the enzyme according to the present invention, wherein both the
promoter and the nucleotide sequence are native to that organism and are in their natural
environment. Thus, in these highly preferred embodiments, the present invention does
not cover the native nucleotide coding sequence according to the present invention in its
natural environment when it is under the control of its native promoter which is also in
its natural environment. In addition, in this highly preferred embodiment, the present
invention does not cover the native enzyme according to the present invention when it is
in its natural environment and when it has been expressed by its native nucleotide coding
sequence which is also in its natural environment and when that nucleotide sequence is
under the control of its native promoter which is also in its natural environment.

The term "promoter" is used in the normal sense of the art, e.g. an RNA polymerase
binding site in the Jacob-Monod theory of gene expression.

In one aspect, the promoter of the present invention is capable of expressing a GOI,
which can be the nucleotide sequence coding for the enzyme of the present invention.

In another aspect, the nucleotide sequence according to the present invention is under the
control of a promoter that allows expression of the nucleotide sequence. In this regard,
the promoter need not necessarily be the same promoter as that of the present invention.
In this aspect, the promoter may be a cell or tissue specific promoter. If, for example,
the organism is a plant then the promoter can be one that affects expression of the
nucleotide sequence in any one or more of stem, sprout, root and leaf tissues.

By way of example, the promoter for the nucleotide sequence of the present invention can
be the α-Amy 1 promoter (otherwise known as the Amy 1 promoter, the Amy 637
promoter or the α-Amy 637 promoter) as described in our co-pending UK patent
application No. 9421292.5 filed 21 October 1994. That promoter comprises the sequence
shown in Figure 1.

Alternatively, the promoter for the nucleotide sequence of the present invention can be
the α-Amy 3 promoter (otherwise known as the Amy 3 promoter, the Amy 351 promoter
or the α-Amy 351 promoter) as described in our co-pending UK patent application No.
9421286.7 filed 21 October 1994. That promoter comprises the sequence shown in
Figure 2.

Preferably, the promoter is the promoter of the present invention.

In addition to the nucleotide sequences described above, the promoters, particularly that
of the present invention, could additionally include features to ensure or to increase
expression in a suitable host. For example, the features can be conserved regions such
as a Pribnow Box or a TATA box. The promoters may even contain other sequences to
affect (such as to maintain, enhance, decrease) the levels of expression of the GOI. For
example, suitable other sequences include the Sh1-intron or an ADH intron. Other
sequences include inducible elements - such as temperature, chemical, light or stress
inducible elements.

Also, suitable elements to enhance transcription or translation may be present. An
example of the latter element is the TMV 5' signal sequence (see Sleat Gene 217 [1987]
217-225; and Dawson Plant Mol. Biol. 23 [1993] 97).

In addition the present invention also encompasses combinations of promoters and/or
nucleotide sequences coding for proteins or enzymes and/or elements. For example, the
present invention encompasses the combination of a promoter according to the present
invention operatively linked to a GOI, which could be a nucleotide sequence according
to the present invention, and another promoter such as a tissue specific promoter
operatively linked to the same or a different GOI.

The present invention also encompasses the use of promoters to express a nucleotide
sequence coding for the enzyme according to the present invention, wherein a part of the
promoter is inactivated but wherein the promoter can still function as a promoter. Partial
inactivation of a promoter in some instances is advantageous.

In particular, with the Amy 351 promoter mentioned earlier it is possible to inactivate a
part of it so that the partially inactivated promoter expresses GOIs in a more specific
manner such as in just one specific tissue type or organ.

The term "inactivated" means partial inactivation in the sense that the expression pattern
of the promoter is modified but wherein the partially inactivated promoter still functions
as a promoter. However, as mentioned above, the modified promoter is capable of
expressing a GOI in at least one (but not all) specific tissue of the original promoter.
One such promoter is the Amy 351 promoter described above.

Examples of partial inactivation include altering the folding pattern of the promoter
sequence, or binding species to parts of the nucleotide sequence, so that a part of the
nucleotide sequence is not recognised by, for example, RNA polymerase. Another, and
preferable, way of partially inactivating the promoter is to truncate it to form fragments
thereof. Another way would be to mutate at least a part of the sequence so that the RNA
polymerase cannot bind to that part or another part.

Another modification is to mutate the binding sites for regulatory proteins, for example
the CreA protein known from filamentous fungi to exert carbon catabolite repression, and
thus abolish the catabolite repression of the native promoter.
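
By way of a non-limiting illustration, candidate CreA binding sites could be located in a
promoter sequence before mutagenesis with a short script such as the Python sketch
below. It assumes the consensus 5'-SYGGRG-3' commonly reported for CreA in the general
literature (this consensus is not stated in the present specification). The function
names and the test sequences are hypothetical.

```python
import re

# Assumed CreA consensus 5'-SYGGRG-3' (S = C/G, Y = C/T, R = A/G).
# Taken from the general literature, not from this specification.
# The lookahead lets overlapping sites be reported as well.
CREA_SITE = re.compile(r"(?=([CG][CT]GG[AG]G))")

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_crea_sites(promoter: str):
    """Return (strand, 0-based start on that strand, matched site) tuples."""
    promoter = promoter.upper()
    hits = []
    for strand, seq in (("+", promoter), ("-", reverse_complement(promoter))):
        for match in CREA_SITE.finditer(seq):
            hits.append((strand, match.start(), match.group(1)))
    return hits


# Hypothetical promoter fragment with one site, and a mutated version
# in which the GG core has been altered.
wild_type = "AAAGCGGGGTTT"
mutated = "AAAGCTTGGTTT"
print(find_crea_sites(wild_type))
print(find_crea_sites(mutated))
```

A sequence for which the scan returns no hits on either strand would, under the stated
assumption, be a candidate for a promoter variant lacking recognisable CreA sites;
whether repression is actually abolished would have to be confirmed experimentally.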

The term "GOI" with reference to the present invention means any gene of interest. A
GOI can be any nucleotide that is either foreign or natural to the organism (e.g.
filamentous fungus, preferably of the genus Aspergillus, or a plant) in question. Typical
examples of a GOI include genes encoding for proteins and enzymes that modify
metabolic and catabolic processes. The GOI may code for an agent for introducing or
increasing pathogen resistance. The GOI may even be an antisense construct for
modifying the expression of natural transcripts present in the relevant tissues. The GOI
may even code for a non-natural protein of a filamentous fungus, preferably of the genus
Aspergillus, or a compound that is of benefit to animals or humans.

For example, the GOI could code for a pharmaceutically active protein or enzyme such
as any one of the therapeutic compounds insulin, interferon, human serum albumin,
human growth factor and blood clotting factors. In this regard, the transformed cell or
organism could prepare acceptable quantities of the desired compound which would be
easily retrievable from the cell or organism. The GOI may even be a protein giving
nutritional value to a food or crop. Typical examples include plant proteins that can
inhibit the formation of anti-nutritive factors and plant proteins that have a more desirable
amino acid composition (e.g. a higher lysine content than a non-transgenic plant). The
GOI may even code for an enzyme that can be used in food processing such as chymosin,
thaumatin and α-galactosidase. The GOI can be a gene encoding for any one of a pest
toxin, an antisense transcript such as that for patatin or α-amylase, ADP-glucose
pyrophosphorylase (e.g. see EP-A-0455316), a protease antisense or a glucanase.

The GOI can be the nucleotide sequence coding for the α-amylase enzyme which is the
subject of our co-pending UK patent application 9413439.2 filed on 4 July 1994, the
sequence of which is shown in Figure 3. The GOI can be the nucleotide sequence coding
for the α-amylase enzyme which is the subject of our co-pending UK patent application
9421290.9 filed on 21 October 1994, the sequence of which is shown in Figure 4. The
GOI can be any of the nucleotide sequences coding for the ADP-glucose
pyrophosphatase enzymes which are the subject of our co-pending PCT patent
application PCT/EP94/01082 filed 7 April 1994, the sequences of which are shown in
Figures 5 and 6. The GOI can be any of the nucleotide sequences coding for the
α-glucan lyase enzyme which are described in our co-pending PCT patent application
PCT/EP94/03397 filed 15 October 1994, the sequences of which are shown in Figures
7-10.

In one preferred embodiment, the GOI is a nucleotide sequence coding for the enzyme
according to the present invention.

As mentioned above, a preferred host organism is of the genus Aspergillus, such as
Aspergillus niger. The transgenic Aspergillus according to the present invention can be
prepared by following the teachings of Rambosek, J. and Leach, J. 1987 (Recombinant
DNA in filamentous fungi: Progress and Prospects. CRC Crit. Rev. Biotechnol. 6:357-
393), Davis R.W. 1994 (Heterologous gene expression and protein secretion in
Aspergillus. In: Martinelli S.D., Kinghorn J.R. (Editors) Aspergillus: 50 years on. Progress in
industrial microbiology vol 29. Elsevier Amsterdam 1994. pp 525-560), Ballance, D.J.
1991 (Transformation systems for Filamentous Fungi and an Overview of Fungal Gene
structure. In: Leong, S.A., Berka R.M. (Editors) Molecular Industrial Mycology. Systems
and Applications for Filamentous Fungi. Marcel Dekker Inc. New York 1991. pp 1-29)
and Turner G. 1994 (Vectors for genetic manipulation. In: Martinelli S.D., Kinghorn
J.R. (Editors) Aspergillus: 50 years on. Progress in industrial microbiology vol 29.
Elsevier Amsterdam 1994. pp. 641-666). However, the following commentary provides
a summary of those teachings for producing transgenic Aspergillus according to the
present invention.

Filamentous fungi have been widely used in industry for almost a century for the
production of organic compounds and enzymes. Traditional Japanese koji and soy fermentations
have used Aspergillus sp. for hundreds of years. In this century Aspergillus niger has
been used for production of organic acids, in particular citric acid, and for production of
various enzymes for use in industry.

There are two major reasons why filamentous fungi have been so widely used in
industry. First, filamentous fungi can produce high amounts of extracellular products, for
example enzymes and organic compounds such as antibiotics or organic acids. Second,
filamentous fungi can grow on low cost substrates such as grains, bran, beet pulp etc.
The same reasons have made filamentous fungi attractive organisms as hosts for
heterologous expression according to the present invention.

In order to prepare the transgenic Aspergillus, expression constructs are prepared by
inserting a GOI (such as an amylase or SEQ. I.D. No. 2) into a construct designed for
expression in filamentous fungi.

Several types of constructs used for heterologous expression have been developed. The
constructs contain the promoter according to the present invention (or if desired another
promoter if the GOI codes for the enzyme according to the present invention) which is
active in fungi. Examples of promoters other than that of the present invention include
a fungal promoter for a highly expressed extracellular enzyme, such as the glucoamylase
promoter or the α-amylase promoter. The GOI can be fused to a signal sequence (such
as that of the present invention or another suitable sequence) which directs the protein
encoded by the GOI to be secreted. Usually a signal sequence of fungal origin is used,
such as that of the present invention. A terminator active in fungi ends the expression
system, such as that of the present invention.
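
The order of elements in such a construct (promoter, signal sequence, GOI, terminator)
can be sketched in Python as follows. This is a minimal illustration only: the class
name, the checks performed and the placeholder sequences are hypothetical and are not
sequences of the present invention.

```python
from dataclasses import dataclass

STOP_CODONS = {"TAA", "TAG", "TGA"}


@dataclass
class ExpressionCassette:
    """Hypothetical model of a secretion construct, listed 5' to 3'."""
    promoter: str
    signal_sequence: str  # coding sequence, expected to start with ATG
    goi: str              # coding sequence, expected to end with a stop codon
    terminator: str

    def coding_region(self) -> str:
        # The GOI is fused directly to the signal sequence.
        return self.signal_sequence + self.goi

    def assemble(self) -> str:
        cds = self.coding_region()
        if not cds.startswith("ATG"):
            raise ValueError("signal sequence must start with ATG")
        if len(cds) % 3 != 0:
            raise ValueError("signal sequence and GOI are not in frame")
        if cds[-3:] not in STOP_CODONS:
            raise ValueError("GOI must end with a stop codon")
        return self.promoter + cds + self.terminator


# Placeholder sequences, far shorter than any real element.
cassette = ExpressionCassette(
    promoter="TATAAA",
    signal_sequence="ATGAAACTG",
    goi="GCTGCTTAA",
    terminator="AATAAA",
)
print(cassette.assemble())
```

The frame check reflects the fact that a fusion between the signal sequence and the GOI
is only useful if the reading frame is preserved across the junction.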

Another type of expression system has been developed in fungi where the GOI is fused
to a smaller or a larger part of a fungal gene encoding a stable protein. This can stabilize
the protein encoded by the GOI. In such a system a cleavage site, recognized by a
specific protease, can be introduced between the fungal protein and the protein encoded
by the GOI, so the produced fusion protein can be cleaved at this position by the specific
protease thus liberating the protein encoded by the GOI ("POI"). By way of example,
one can introduce a site which is recognized by a KEX-2 like peptidase found in at least
some Aspergilli. Such a fusion leads to cleavage in vivo resulting in protection of the
POI and production of POI and not a larger fusion protein.
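
The fusion-and-cleavage principle can be illustrated with the following Python sketch.
It assumes a dibasic Lys-Arg ("KR") site, which is the motif generally associated with
KEX2-type peptidases, with cleavage on the C-terminal side of the site. The function
names and the amino acid sequences are hypothetical, and the sketch simply cuts at the
first occurrence of the site.

```python
def build_fusion(carrier: str, poi: str, site: str = "KR") -> str:
    """Join carrier protein, cleavage site and protein of interest (POI)."""
    return carrier + site + poi


def cleave_after_site(fusion: str, site: str = "KR"):
    """Cut once, immediately after the first occurrence of the site.

    Returns [carrier_with_site, released_poi], or [fusion] if no site is found.
    """
    index = fusion.find(site)
    if index == -1:
        return [fusion]
    cut = index + len(site)
    return [fusion[:cut], fusion[cut:]]


# Hypothetical one-letter amino acid sequences.
fusion = build_fusion(carrier="MSTAGV", poi="AEPLQ")
print(fusion)                     # MSTAGVKRAEPLQ
print(cleave_after_site(fusion))  # ['MSTAGVKR', 'AEPLQ']
```

A carrier that itself contains the motif would be cut internally by this simplified
model, which is one reason the position and context of the site matter in practice.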

Heterologous expression in Aspergillus has been reported for several genes coding for
bacterial, fungal, vertebrate and plant proteins. The proteins can be deposited
intracellularly if the GOI is not fused to a signal sequence. Such proteins will accumulate
in the cytoplasm and will usually not be glycosylated which can be an advantage for some
bacterial proteins. If the GOI is equipped with a signal sequence the protein will
accumulate extracellularly.

With regard to product stability and host strain modifications, some heterologous proteins
are not very stable when they are secreted into the culture fluid of fungi. Most fungi
produce several extracellular proteases which degrade heterologous proteins. To avoid
this problem special fungal strains with reduced protease production have been used as
hosts for heterologous production.

Several transformation protocols have been developed for filamentous fungi
(Ballance 1991, ibid). Many of them are based
on preparation of protoplasts and introduction of DNA into the protoplasts using PEG and
Ca2+ ions. The transformed protoplasts then regenerate and the transformed fungi are
selected using various selective markers. Among the markers used for transformation are
a number of auxotrophic markers such as argB, trpC, niaD and pyrG, and antibiotic
resistance markers such as benomyl resistance, hygromycin resistance and phleomycin
resistance. A very commonly used transformation marker is the amdS gene of A. nidulans
which in high copy number allows the fungus to grow with acrylamide as the sole
nitrogen source.

Even though the enzyme, the nucleotide sequence coding for same and the promoter of
the present invention are not disclosed in EP-B-0470145 and CA-A-2006454, those two
documents do provide some useful background commentary on the types of techniques
that may be employed to prepare transgenic plants according to the present invention.
Some of these background teachings are now included in the following commentary.

The basic principle in the construction of genetically modified plants is to insert genetic
information in the plant genome so as to obtain a stable maintenance of the inserted
genetic material.

Several techniques exist for inserting the genetic information, the two main principles
being direct introduction of the genetic information and introduction of the genetic
information by use of a vector system. A review of the general techniques may be found
in articles by Potrykus (Annu Rev Plant Physiol Plant Mol Biol [1991] 42:205-225) and
Christou (Agro-Food-Industry Hi-Tech March/April 1994 17-27).

Thus, in one aspect, the present invention relates to a vector system which carries a
promoter or nucleotide sequence or construct according to the present invention and
which is capable of introducing the promoter or nucleotide sequence or construct into the
genome of an organism, such as a plant.

The vector system may comprise one vector, but it can comprise two vectors. In the case
of two vectors, the vector system is normally referred to as a binary vector system.
Binary vector systems are described in further detail in Gynheung An et al. (1980),
Binary Vectors, Plant Molecular Biology Manual A3, 1-19.

One extensively employed system for transformation of plant cells with a given promoter
or nucleotide sequence or construct is based on the use of a Ti plasmid from
Agrobacterium tumefaciens or a Ri plasmid from Agrobacterium rhizogenes (An et al.
(1986), Plant Physiol. 81, 301-305 and Butcher D.N. et al. (1980), Tissue Culture
Methods for Plant Pathologists, eds.: D.S. Ingrams and J.P. Helgeson, 203-208).

Several different Ti and Ri plasmids have been constructed which are suitable for the
construction of the plant or plant cell constructs described above. A non-limiting
example of such a Ti plasmid is pGV3850.

The promoter or nucleotide sequence or construct of the present invention should
preferably be inserted into the Ti-plasmid between the terminal sequences of the T-DNA
or adjacent a T-DNA sequence so as to avoid disruption of the sequences immediately
surrounding the T-DNA borders, as at least one of these regions appears to be essential
for insertion of modified T-DNA into the plant genome.

As will be understood from the above explanation, if the organism is a plant, then the
vector system of the present invention is preferably one which contains the sequences
necessary to infect the plant (e.g. the vir region) and at least one border part of a T-DNA
sequence, the border part being located on the same vector as the genetic construct.

Furthermore, the vector system is preferably an Agrobacterium tumefaciens Ti-plasmid
or an Agrobacterium rhizogenes Ri-plasmid or a derivative thereof, as these plasmids are
well-known and widely employed in the construction of transgenic plants. Many vector
systems exist which are based on these plasmids or derivatives thereof.

In the construction of a transgenic plant the promoter or nucleotide sequence or construct
of the present invention may be first constructed in a microorganism in which the vector
can replicate and which is easy to manipulate before insertion into the plant. An example
of a useful microorganism is E. coli, but other microorganisms having the above
properties may be used. When a vector of a vector system as defined above has been
constructed in E. coli, it is transferred, if necessary, into a suitable Agrobacterium strain,
e.g. Agrobacterium tumefaciens. The Ti-plasmid harbouring the promoter or nucleotide
sequence or construct of the invention is thus preferably transferred into a suitable
Agrobacterium strain, e.g. A. tumefaciens, so as to obtain an Agrobacterium cell
harbouring the promoter or nucleotide sequence or construct of the invention, which
DNA is subsequently transferred into the plant cell to be modified.

As reported in CA-A-2006454, a large number of cloning vectors are available which
contain a replication system in E. coli and a marker which allows a selection of the
transformed cells. The vectors contain for example pBR322, pUC series, M13 mp
series, pACYC184 etc.

In this way, the nucleotide or construct or promoter of the present invention can be
introduced into a suitable restriction position in the vector. The resulting plasmid is
used for the transformation of E. coli. The E. coli cells are cultivated in a suitable nutrient
medium and then harvested and lysed. The plasmid is then recovered. As a method of
analysis there is generally used sequence analysis, restriction analysis, electrophoresis and
further biochemical-molecular biological methods. After each manipulation, the DNA
sequence used can be cleaved and joined to the next DNA sequence. Each
sequence can be cloned in the same or a different plasmid.

Depending on the method used to introduce the desired promoter or construct or nucleotide
sequence according to the present invention into the plants, the presence and/or insertion of
further DNA sequences may be necessary. If, for example, the Ti- or Ri-plasmid is used
for the transformation of the plant cells, at least the right border, and often both
the right and the left border, of the Ti- or Ri-plasmid T-DNA can be joined as flanking
regions to the introduced genes. The use of T-DNA for the transformation of
plant cells has been intensively studied and is described in EP-A-120516; Hoekema, in:
The Binary Plant Vector System, Offset-drukkerij Kanters B.V., Alblasserdam, 1985,
Chapter V; Fraley, et al., Crit. Rev. Plant Sci., 4:1-46; and An et al., EMBO J. (1985)
4:277-284.

Direct infection of plant tissues by Agrobacterium is a simple technique which has been
widely employed and which is described in Butcher D.N. et al. (1980), Tissue Culture
Methods for Plant Pathologists, eds.: D.S. Ingrams and J.P. Helgeson, 203-208. For
further teachings on this topic see Potrykus (Annu Rev Plant Physiol Plant Mol Biol
[1991] 42:205-225) and Christou (Agro-Food-Industry Hi-Tech March/April 1994 17-27).
With this technique, infection of a plant may be done on a certain part or tissue of the
plant, i.e. on a part of a leaf, a root, a stem or another part of the plant.

Typically, with direct infection of plant tissues by Agrobacterium carrying the promoter
and/or the GOI, a plant to be infected is wounded, e.g. by cutting the plant with a razor
or puncturing the plant with a needle or rubbing the plant with an abrasive. The wound
is then inoculated with the Agrobacterium. The inoculated plant or plant part is then
grown on a suitable culture medium and allowed to develop into mature plants.

When plant cells are constructed, these cells may be grown and maintained in accordance
with well-known tissue culturing methods such as by culturing the cells in a suitable
culture medium supplied with the necessary growth factors such as amino acids, plant
hormones, vitamins, etc.

Regeneration of the transformed cells into genetically modified plants may be
accomplished using known methods for the regeneration of plants from cell or tissue
cultures, for example by selecting transformed shoots using an antibiotic and by
subculturing the shoots on a medium containing the appropriate nutrients, plant
hormones, etc.

Further teachings on plant transformation may be found in EP-A-0449375. 

In summation, the present invention provides an arabinofuranosidase enzyme having
arabinoxylan degrading activity and the nucleotide sequence coding for the same. In
addition, it provides a promoter that can control the expression of that, or another,
nucleotide sequence. In addition it includes terminator and signal sequences for the
same.

The following sample was deposited in accordance with the Budapest Treaty at the
recognised depositary The National Collections of Industrial and Marine Bacteria Limited
(NCIMB) at 23 St. Machar Drive, Aberdeen, Scotland, United Kingdom, AB2 1RY on
16 January 1995:

E. coli containing plasmid pB53.1 {i.e. E. coli DH5α-pB53.1}. The deposit number is
NCIMB 40703.

The present invention will now be described by way of example. 



In the following Examples reference is made to the accompanying figures in which:

Figures 1-10 are sequences of promoters and GOIs of earlier patent applications that are
useful for use with the aspects of the present invention;

Figure 11 is a plasmid map of the plasmid pB53.1, which is the subject of deposit
NCIMB 40703;

Figure 12 is a schematic diagram of deletions made to the promoter of the present 
invention; 

Figure 13 is a plasmid map of pXP-AMY;

Figure 14 is a plasmid map of pXP-XssAMY;

Figure 15 is a graph;

Figure 16 is an HP-TLC profile;

Figure 17 is an HP-TLC profile;

Figure 18 is an HPLC profile;

Figure 19 is a viscosity plot;

Figure 20 is an activity plot;

Figure 21 is an activity plot; and

Figure 22 is an activity plot.

The following text discusses the use of inter alia recombinant DNA techniques. General
teachings of recombinant DNA techniques may be found in Sambrook, J., Fritsch, E.F.,
Maniatis T. (Editors) Molecular Cloning. A laboratory manual. Second edition. Cold
Spring Harbor Laboratory Press. New York 1989.

In these Examples, the enzyme of the present invention is sometimes referred to as AbfC.
In addition, the promoter of the present invention is sometimes referred to as the AbfC
promoter.

Purification of the arabinofuranosidase 

Aspergillus niger 3M43 was grown in medium containing wheat bran and beet pulp. The
fermentation broth was separated from the solid part of the broth by filtration.
Concentrated fermentation broth was loaded on a 25x100 mm Q-SEPHAROSE
High Performance column (Pharmacia), equilibrated with 20 mM Tris-HCl pH 7.5, and
a linear gradient from 0-500 mM NaCl was performed and fractions of the eluate were
collected. The arabinofuranosidase was eluted at 130-150 mM NaCl.
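
As a small worked example, the position of the elution window within the linear
gradient can be computed as below. The sketch assumes an ideal linear gradient running
from 0 to 500 mM NaCl; the gradient volume is not stated above, so only the fractional
position is calculated, and the function name is hypothetical.

```python
def gradient_fraction(conc_mM: float, start_mM: float = 0.0, end_mM: float = 500.0) -> float:
    """Fraction of an ideal linear gradient at which conc_mM is reached."""
    if end_mM == start_mM:
        raise ValueError("gradient start and end must differ")
    return (conc_mM - start_mM) / (end_mM - start_mM)


# Elution window reported above: 130-150 mM NaCl.
window = [gradient_fraction(c) for c in (130.0, 150.0)]
print([f"{fraction:.0%}" for fraction in window])  # ['26%', '30%']
```

Under this assumption the enzyme elutes roughly between one quarter and one third of the
way through the gradient.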

The fractions containing the arabinofuranosidase were combined and desalted using a 
50x200 mm G-25 SEPHAROSE Superfine (Pharmacia). The column was eluted with 
distilled water. 

After desalting the enzyme was concentrated using High-Trap spin columns. Next the
concentrated and desalted fractions were subjected to gel filtration on a 50x600 mm
SUPERDEX 50 column. The sample was loaded and the column was eluted with 0.2 M
Phosphate buffer pH 7.0 plus 0.2 M NaCl, and fractions of the eluate were collected.

The fractions containing arabinofuranosidase were combined and desalted and
concentrated as described above. The combined fractions were loaded on a 16x100 mm
Phenyl Sepharose High Performance column (Pharmacia), equilibrated with 50 mM
Phosphate buffer pH 6.0, containing 1.5 M (NH4)2SO4. A gradient where the (NH4)2SO4
concentration was varied from 1.5 - 0 M was applied and the eluate collected in fractions.

The fractions containing arabinofuranosidase were combined. The purity of the
arabinofuranosidase was evaluated by SDS-PAGE using the Phast system gel
(Pharmacia).

Characterization

The molecular weight of the purified arabinofuranosidase was determined by mass
spectrometry using laser desorption technology. The MW of the arabinofuranosidase was
found to be 33,270 D ± 50 D.

The pI value was determined by use of a Broad pI Kit (Pharmacia). The
arabinofuranosidase has a pI value of about 3.7.

After SDS-PAGE analysis, treatment with PAS reagent showed that the arabinofuranosidase
was glycosylated. The PAS staining was done according to the procedure of I.
Van-Seuningen and M. Davril (1992) Electrophoresis 13 pp 97-99.

Activity Studies

Activity of AbfC as a function of water soluble pentosan (WSP) concentrations (mg/ml)
was determined. The results are shown in Figure 21. The results show that AbfC
activity reached maximum at a substrate concentration of 8 mg/ml WSP.

pH Activity Studies 

The effect of pH on the activity of the arabinofuranosidase of the present invention was
investigated using water soluble pentosan (10 mg/ml) from wheat as a substrate in 50 mM
citric acid sodium phosphate buffer. The incubation time was 15 minutes. The
arabinofuranosidase of the present invention was observed to have a wide pH optimum
range of from about 2.5 to about 7.0 (see Figure 20), more especially from about 3.3 to
about 4.6, more especially about 4.

Temperature Activity Studies 

The effect of temperature on the activity of the arabinofuranosidase of the present
invention was investigated using water soluble pentosan (10 mg/ml) from wheat as a
substrate in 50 mM sodium acetate at a pH of 5.0. The incubation time was 15 minutes.
The arabinofuranosidase of the present invention was observed to have an optimal activity
at a temperature of from about 40°C to about 60°C, more especially from about 45°C to
about 55°C, more especially about 50°C (Figure 22). The enzyme is still active at about
10°C and showed residual activity at 70°C and 80°C.

Amino acid sequencing of the arabinofuranosidase 

The enzyme was digested with endoproteinase Lys-C sequencing grade from Boehringer
Mannheim using a modification of the method described by Stone & Williams 1993
(Stone, K.L. and Williams, K.R. (1993). Enzymatic digestion of Proteins and HPLC
Peptide Isolation. In: Matsudaira P. (Editor). A practical Guide to Protein and Peptide
Purification for Microsequencing. Second Edition. Academic Press, San Diego 1993. pp
45-73).
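
For orientation, the peptides expected from such a digest can be predicted in silico.
The Python sketch below assumes the generally accepted specificity of endoproteinase
Lys-C, namely cleavage on the C-terminal side of lysine residues, and ignores missed
cleavages. The function name and the example sequence are hypothetical; the sequence is
not that of the enzyme of the present invention.

```python
def lysc_digest(protein: str):
    """Split a one-letter protein sequence after every lysine (K)."""
    peptides = []
    current = ""
    for residue in protein.upper():
        current += residue
        if residue == "K":
            # Lys-C cuts after K, so the lysine stays at the C-terminus.
            peptides.append(current)
            current = ""
    if current:
        # C-terminal peptide that does not end in lysine.
        peptides.append(current)
    return peptides


# Hypothetical sequence for illustration only.
print(lysc_digest("ASTKGLVNEKPW"))  # ['ASTK', 'GLVNEK', 'PW']
```

Every peptide except the C-terminal one is therefore expected to end in lysine, which is
a useful sanity check on sequencing results.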

Freeze dried β-arabinofuranosidase (0.4 mg) was dissolved in 50 µl of 8 M urea, 0.4 M
NH4HCO3, pH 8.4. After overlay with N2 and addition of 5 µl of 45 mM DTT, the
protein was denatured and reduced for 15 min at 50°C under N2. After cooling to RT,
5 µl of 100 mM iodoacetamide was added for the cysteines to be derivatised for 15 min
at RT in the dark under N2. Subsequently, 90 µl of water and 5 µg of endoproteinase
Lys-C in 50 µl of 50 mM Tricine and 10 mM EDTA, pH 8.0, was added and the
digestion was carried out for 24h at 37°C under N2. The resulting peptides were
separated by reversed phase HPLC on a VYDAC C18 column (0.46 x 15 cm; 10 µm;
The Separations Group; California) using solvent A: 0.1% TFA in water and solvent B:
0.1% TFA in acetonitrile. Selected peptides were rechromatographed on a Develosil C18
column (0.46 x 10 cm; 3 µm) using the same solvent system prior to sequencing on an
Applied Biosystems 476A sequencer using pulsed-liquid fast cycles.
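
The effect of the successive additions on the reaction mixture can be checked with a
short calculation. The sketch below takes the volumes in the protocol above to be
microlitres and the enzyme amount to be micrograms, and assumes that the volumes are
additive; the variable names are hypothetical.

```python
# Volumes added in sequence, in microlitres (assumed units).
volumes_ul = {
    "urea_buffer": 50,     # 8 M urea, 0.4 M NH4HCO3
    "dtt": 5,              # 45 mM DTT
    "iodoacetamide": 5,    # 100 mM iodoacetamide
    "water": 90,
    "lysc_solution": 50,   # Lys-C in Tricine/EDTA
}

total_ul = sum(volumes_ul.values())

# Final urea concentration from C1 * V1 = C2 * V2.
urea_final_M = 8.0 * volumes_ul["urea_buffer"] / total_ul

# Substrate-to-enzyme mass ratio: 0.4 mg substrate, 5 micrograms Lys-C.
substrate_ug = 400.0
enzyme_ug = 5.0
mass_ratio = substrate_ug / enzyme_ug

print(total_ul, urea_final_M, mass_ratio)  # 200 2.0 80.0
```

On these assumptions the urea is diluted from 8 M to 2 M before the digestion, and the
enzyme is used at a mass ratio of about 1:80 relative to the substrate.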

The following peptide sequences were found:

SEQ I.D. No. 4
SEQ I.D. No. 5
SEQ I.D. No. 6
SEQ I.D. No. 7
SEQ I.D. No. 8

Isolation of a PCR clone of a fragment of the gene

PCR primers were synthesised using an Applied Biosystems DNA synthesiser model 392.
In this regard, PCR primers were synthesised from one of the found peptide sequences,
namely SEQ ID No. 5. The primers were:

One primer from EMTAQA (reversed)

SEQ ID NO. 9 GCY TGN GCN GTC ATY TC
17 mer 64 mix

One primer from MIVEAIG

SEQ ID NO. 10 ATG ATH GTN GAR GCN ATH GG
20 mer 288 mix
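
The "mix" figures quoted for the two primers follow from the IUPAC ambiguity codes they contain: the number of distinct oligonucleotides in a degenerate primer is the product of the number of bases each code stands for. The following is a minimal sketch of that arithmetic, not part of the original protocol; the function and variable names are illustrative assumptions.

```python
# Number of concrete bases represented by each IUPAC nucleotide code.
IUPAC_BASE_COUNTS = {
    "A": 1, "C": 1, "G": 1, "T": 1,
    "R": 2, "Y": 2, "S": 2, "W": 2, "K": 2, "M": 2,
    "B": 3, "D": 3, "H": 3, "V": 3,
    "N": 4,
}


def normalise(primer: str) -> str:
    """Strip the codon spacing used in the text and upper-case the primer."""
    return primer.replace(" ", "").upper()


def degeneracy(primer: str) -> int:
    """Return how many distinct oligonucleotides a degenerate primer encodes."""
    total = 1
    for code in normalise(primer):
        total *= IUPAC_BASE_COUNTS[code]
    return total


# Primer sequences as given in the text (hypothetical variable names).
REVERSE_PRIMER = "GCY TGN GCN GTC ATY TC"      # SEQ ID NO. 9
FORWARD_PRIMER = "ATG ATH GTN GAR GCN ATH GG"  # SEQ ID NO. 10

if __name__ == "__main__":
    for name, primer in (("SEQ ID NO. 9", REVERSE_PRIMER),
                         ("SEQ ID NO. 10", FORWARD_PRIMER)):
        print(name, len(normalise(primer)), "mer,", degeneracy(primer), "mix")
```

Under these assumptions the computed lengths and degeneracies can be compared directly with the "17 mer 64 mix" and "20 mer 288 mix" annotations.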

PCR amplification was performed with 100 pmol of each of these primers in 100 µl
reactions using Amplitaq polymerase (PERKIN ELMER). The following program was used:

STEP    TEMP    TIME
1       94°C    2 min
2       94°C    1 min
3       55°C    2 min
4       72°C    2 min
5       72°C    5 min
6       5°C     SOAK

Steps 2-4 were repeated for 40 cycles.

PCR reactions were performed on a PERKIN ELMER DNA Thermal Cycler.

A 100 bp amplified fragment was isolated and cloned into a pT7-Blue T-vector, according
to the manufacturer's instructions (Novagen).

Isolation of A. niger genomic DNA 

1 g of frozen A. niger mycelium was ground in a mortar under liquid nitrogen. Following
evaporation of the nitrogen cover, the ground mycelium was extracted with 15 ml of
an extraction buffer (100 mM Tris-HCl, pH 8.0, 0.50 mM EDTA, 500 mM NaCl, 10 mM
β-mercaptoethanol) containing 1 ml 20% sodium dodecyl sulphate. After incubation at
65°C for 10 min, 5 ml 5 M KAc, pH 5.0, was added and the mixture further incubated,
after mixing, on ice for 20 min. After extraction, the mixture was centrifuged for 20
min and the supernatant mixed with 0.6 vol. isopropanol to precipitate the extracted
DNA. After further centrifugation for 15 min the DNA pellet was dissolved in 0.7 ml
TE (10 mM Tris-HCl pH 8.0, 1 mM EDTA) and precipitated with 75 µl 3 M NaAc, pH
4.8, and 500 µl isopropanol.

After centrifugation the pellet was washed with 70% EtOH and dried under vacuum.
The DNA was dissolved in 200 µl TE and stored at -20°C.

Construction of a library 

20 µg genomic DNA was partly digested with Tsp509I, which gives ends which are
compatible with EcoRI ends. The digested DNA was separated on a 1% agarose gel and
fragments of 4-10 kb were purified. A λZAPII EcoRI/CIAP kit from Stratagene was used
for library construction according to the manufacturer's instructions. 2 µl of the ligation
(5 µl in total) was packaged with Gigapack Gold II packaging extract from Stratagene. The
library contained 650,000 independent clones.

Screening of the library

2 x 50,000 pfu were plated on NZY plates and plaque lifts were done on Hybond N sheets
(Amersham). Plaque lifts were done in duplicate. The sheets were hybridized with the
PCR clone labelled with 32P-dCTP (Amersham) using the Ready-to-go labelling kit from
Pharmacia. Positive clones were reckoned only when hybridization was detected on both
sheets. The gene was sequenced, and the found sequence showed that all of the peptides
sequenced were coded by the found sequence.

Sequence information

SEQ. ID. No. 12 presents the promoter sequence, the enzyme coding sequence, the
terminator sequence and the signal sequence and the amino acid sequence of the enzyme
of the present invention.
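
The feature table accompanying SEQ ID NO: 12 in the sequence listing gives the locations CDS 870..1757, sig_peptide 870..947 and mat_peptide 948..1754. As a rough consistency check, the lengths of these features in nucleotides and codons can be derived from the coordinates. The sketch below is illustrative only; the dictionary and function names are assumptions and not part of the listing.

```python
# Feature locations as listed for SEQ ID NO: 12 (1-based, inclusive coordinates).
FEATURES = {
    "CDS": (870, 1757),
    "sig_peptide": (870, 947),
    "mat_peptide": (948, 1754),
}


def length_nt(start: int, end: int) -> int:
    """Length of an inclusive 1-based interval in nucleotides."""
    return end - start + 1


def length_codons(start: int, end: int) -> int:
    """Length of a feature in codons; the interval must be a multiple of three."""
    nt = length_nt(start, end)
    if nt % 3 != 0:
        raise ValueError(f"{start}..{end} is not a whole number of codons")
    return nt // 3


CODON_COUNTS = {name: length_codons(*loc) for name, loc in FEATURES.items()}

if __name__ == "__main__":
    for name, count in CODON_COUNTS.items():
        print(f"{name}: {length_nt(*FEATURES[name])} nt, {count} codons")
```

On these coordinates the CDS spans the signal peptide, the mature peptide and one further codon, which is consistent with a terminal stop codon.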


Arabinofuranosidase assay 

Two different arabinoxylan preparations from wheat flour, Wheat Insoluble Pentosan
(WIP) and Wheat Soluble Pentosan (WSP), were degraded with the arabinofuranosidase
enzyme of the present invention alone and in combination with an endoxylanase purified
from A. niger. The assays were done on 1% substrate in 50 mM Na-acetate
buffer at pH 5.0. The reactions were performed at 30°C for 2.5 hours. The reactions
were stopped by addition of 3 vol. ethanol, which precipitates the high molecular weight
material. The samples were centrifuged and the supernatants were collected, dried under
vacuum and resuspended in 0.5 ml distilled water. The samples were diluted 1:1 in
water and analysed on a Chromopack Carbohydrate Pb column (300 x 7.8 mm, cat.
29010) using a Shimadzu C-R4A Chromatopac HPLC system with a Shimadzu RID-6A
refractive index detector in accordance with the supplier's instructions.

The column was calibrated using a standard composed of 0.48 mg/ml xylotriose, 0.48
mg/ml xylobiose, 0.60 mg/ml xylose and 0.58 mg/ml L-arabinose. The peaks were
identified and quantified using the software supplied with the equipment.

Results - Liberated saccharides from Wheat Insoluble Pentosan

Substrate 1% WIP in 50 mM Na-acetate buffer pH 5.0. Values are expressed in mg/ml.

              xylotriose   xylobiose   xylose   arabinose
no enzyme     0.0          0.0         0.0      0.0
abfC          0.0          0.0         0.0      0.11
xyl           0.09         0.14        0.0      0.0
abfC + xyl    0.37         0.41        0.0      0.30

abfC denotes the enzyme according to the present invention; and xyl denotes the xylanase
described before.

Results - Saccharides liberated from Wheat Soluble Pentosan

Substrate 1% WSP in 50 mM Na-acetate buffer pH 5.0. Values are expressed in
mg/ml.

              xylotriose   xylobiose   xylose   arabinose
no enzyme     0.0          0.0         0.0      0.0
abfC          0.0          0.0         0.0      0.30
xyl           0.08         0.14        0.0      0.0
abfC + xyl    0.42         0.47        0.0      0.42

abfC denotes the enzyme according to the present invention; and xyl denotes the xylanase
described before.

Figure 17 shows HP-TLC profiles of the AbfC enzyme acting synergistically with
Xylanase A. In this Figure, the following abbreviations are used: water-soluble pentosan
(WSP); water-insoluble pentosan (WIP); and oat xylan as substrate. The standards were:
X - xylose; X2 - xylobiose; X3 - xylotriose; A - arabinose.

Figure 18 shows the HPLC analysis of hydrolysis products using 1% oat spelt xylan as
the substrate. Figure 18(a) and Figure 18(b) show the products when the AbfC enzyme
and the xylanase enzyme respectively were used alone. Figure 18(c) shows the products
when the AbfC enzyme and the xylanase enzyme were combined.

The results of these experiments provide two important findings.

First, the enzyme of the present invention liberates arabinose, in particular L-arabinose,
from arabinoxylan.

Second, the combined action of the enzyme according to the present invention with the
endoxylanase is significantly higher than the sum of their individual actions. Accordingly,
the two enzymes affect each other's enzymatic activities in a synergistic fashion.
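
The synergy claim can be illustrated numerically from the two result tables above by comparing the total saccharide released by the combined enzymes with the sum of what each enzyme released alone. The following is a minimal sketch under that assumption; the "synergy ratio" is an illustrative measure introduced here and is not a quantity defined in the specification.

```python
# Released saccharides in mg/ml, in the order:
# (xylotriose, xylobiose, xylose, arabinose), taken from the result tables.
WIP = {
    "abfC": (0.0, 0.0, 0.0, 0.11),
    "xyl": (0.09, 0.14, 0.0, 0.0),
    "abfC + xyl": (0.37, 0.41, 0.0, 0.30),
}
WSP = {
    "abfC": (0.0, 0.0, 0.0, 0.30),
    "xyl": (0.08, 0.14, 0.0, 0.0),
    "abfC + xyl": (0.42, 0.47, 0.0, 0.42),
}


def total_released(table: dict, treatment: str) -> float:
    """Sum of all quantified saccharides for one treatment (mg/ml)."""
    return sum(table[treatment])


def synergy_ratio(table: dict) -> float:
    """Combined release divided by the sum of the individual releases.

    A value above 1 means the combination released more than the two
    enzymes did separately (hypothetical measure, for illustration only).
    """
    combined = total_released(table, "abfC + xyl")
    additive = total_released(table, "abfC") + total_released(table, "xyl")
    return combined / additive


if __name__ == "__main__":
    for name, table in (("WIP", WIP), ("WSP", WSP)):
        print(name, round(synergy_ratio(table), 2))
```

For both substrates the ratio is well above 1, in line with the synergistic effect described in the text.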

Induction of the AbfC gene: Identification of inducers

The regulation of transcription of the AbfC encoding gene of Aspergillus niger was
studied using a strain containing a fusion of the AbfC promoter to the β-glucuronidase
encoding gene (uidA) of E. coli.

GUS producing transformants were grown on different carbon sources and assayed both
qualitatively and quantitatively for the ability to hydrolyse p-nitrophenyl glucuronide.

The results are shown below:



CARBON SOURCE            GUS ACTIVITY AFTER 24 HOURS INDUCTION (units/mg)
xylose                   12.37
xylitol                  1.49
arabinose                6.66
arabitol                 5.30
glucose                  0.70
cellobiose               0.95
xylo-oligomer 70         17.26
glucopyranoside          0.40
methyl-xylopyranoside    24.20
xyloglucan               1.00
pectin                   0.27
arabinogalactan          2.60
arabitol + glucose       2.20


The results show that the AbfC promoter is switched on after 24 hours when the strain is
grown in the presence of xylose, xylo-oligomer 70, methyl-xylopyranoside, arabinose and
arabitol. These studies also suggest that methyl-xylopyranoside is the natural and
strongest inducer of this promoter.
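
To make the comparison between carbon sources easier to follow, the GUS activities in the table above can be ranked and expressed relative to a reference condition. The sketch below uses glucose as the reference; that choice, and all names in the code, are assumptions made for illustration and are not part of the reported analysis.

```python
# GUS activity after 24 hours of induction (units/mg), from the table above.
GUS_ACTIVITY = {
    "xylose": 12.37,
    "xylitol": 1.49,
    "arabinose": 6.66,
    "arabitol": 5.30,
    "glucose": 0.70,
    "cellobiose": 0.95,
    "xylo-oligomer 70": 17.26,
    "glucopyranoside": 0.40,
    "methyl-xylopyranoside": 24.20,
    "xyloglucan": 1.00,
    "pectin": 0.27,
    "arabinogalactan": 2.60,
    "arabitol + glucose": 2.20,
}


def fold_over_reference(activities: dict, reference: str = "glucose") -> dict:
    """Express every activity as a multiple of the reference carbon source."""
    base = activities[reference]
    return {source: value / base for source, value in activities.items()}


def rank_inducers(activities: dict) -> list:
    """Carbon sources ordered from highest to lowest GUS activity."""
    return sorted(activities, key=activities.get, reverse=True)


if __name__ == "__main__":
    fold = fold_over_reference(GUS_ACTIVITY)
    for source in rank_inducers(GUS_ACTIVITY):
        print(f"{source:24s}{GUS_ACTIVITY[source]:7.2f}  {fold[source]:6.1f}x glucose")
```

Comparing "arabitol + glucose" with "arabitol" alone in the same way gives a simple view of how much the presence of glucose lowers the measured activity.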

The AbfC promoter is strongly repressed by glucose and is therefore under carbon
catabolite repression. However, unlike all the published promoters for
arabinofuranosidases, which are induced by arabinose and arabitol, the AbfC promoter
of the present invention is regulated strongly by the intermediates in xylose metabolism.
Accordingly, the present invention also covers an arabinofuranosidase promoter wherein
the promoter is inducible by an intermediate in xylose metabolism.

Effects of different promoter deletions on the regulation of the expression of the
AbfC gene

To study the regulation at the molecular level, experiments were set up to detect possible
upstream regulating sequences required for expression of the AbfC gene. A series of
plasmids with deletions in the 5' upstream region of the gene was constructed (see Figure
12). The E. coli uidA gene was used as the reporter gene and a qualitative GUS assay
was performed.

The results indicated that the truncated AbfC promoter of 590 bp contains sufficient
information for the inducibility of the AbfC gene and its regulation. Deletion of a 100 bp
sequence from the XmaIII to the BamHI sites of the promoter led to a reduction in
activity of this promoter. Therefore, this 100 bp area is important for good levels of
gene expression. Deletion of 290 bp before the ATG identified this region as
important, although its deletion was not sufficient to abolish the activity of this promoter.
All the transformants analysed containing this promoter construct showed very pale blue
when tested (+/- GUS). This region is as follows:

-170 TCATCCAATAT 


As seen, this region contains the CCAAT element and is a putative target for a general
transcriptional activator. This sequence is similar to the nuclear protein binding sites
found in two starch inducible promoters: the Aspergillus niger glucoamylase gene and the
Aspergillus oryzae amylase gene as well as the amdS gene of Aspergillus nidulans.



HETEROLOGOUS PROTEIN PRODUCTION USING ASPERGILLUS NIGER
TRANSFORMED WITH THE AbfC PROMOTER AND THE AbfC SIGNAL
SEQUENCE

Transformation of Aspergillus niger

The protocol for transformation of A. niger was based on the teachings of Buxton, F.P.,
Gwynne, D.I., Davis, R.W. 1985 (Transformation of Aspergillus niger using the argB gene
of Aspergillus nidulans. Gene 37:207-214), Daboussi, M.J., Djeballi, A., Gerlinger, C.,
Blaiseau, P.L., Cassan, M., Lebrun, M.H., Parisot, D., Brygoo, Y. 1989
(Transformation of seven species of filamentous fungi using the nitrate reductase gene of
Aspergillus nidulans. Curr. Genet. 15:453-456) and Punt, P.J., van den Hondel,
C.A.M.J.J. 1992 (Transformation of filamentous fungi based on hygromycin B and
Phleomycin resistance markers. Meth. Enzym. 216:447-457).

For the purification of protoplasts, spores from one PDA (Potato Dextrose Agar - from
Difco Lab. Detroit) plate of fresh sporulated N400 (CBS 120.49, Centraalbureau voor
Schimmelcultures, Baarn) (7 days old) are washed off in 5-10 ml water. A shake flask
with 200 ml PDB (Potato Dextrose Broth, Difco 0549-17-9, Difco Lab. Detroit) is
inoculated with this spore suspension and shaken (250 rpm) for 16-20 hours at 30°C.

The mycelium is harvested using Miracloth paper and 3-4 g wet mycelium are transferred
to a sterile petri dish with 10 ml STC (1.2 M sorbitol, 10 mM Tris-HCl pH 7.5, 50 mM
CaCl2) with 75 mg lysing enzymes (Sigma L-2265) and 4500 units lyticase (Sigma
L-8012).

The mycelium is incubated with the enzyme until the mycelium is degraded and the
protoplasts are released. The degraded mycelium is then filtered through a sterile 60 µm
mesh filter. The protoplasts are harvested by centrifugation for 10 min at 2000 rpm in a
swing out rotor. The supernatant is discarded and the pellet is dissolved in 8 ml 1.5 M
MgSO4, and then centrifuged at 3000 rpm for 10 min.

The upper band, containing the protoplasts, is transferred to another tube using a transfer
pipette and 2 ml 0.6 M KCl is added. Carefully, 5 ml 30% sucrose is added on the top
and the tube is centrifuged for 15 min at 3000 rpm.

The protoplasts, lying in the interface band, are transferred to a new tube and diluted
with 1 vol. STC. The solution is centrifuged for 10 min at 3000 rpm. The pellet is washed
twice with STC, and finally resuspended in 1 ml STC. The protoplasts are counted and,
if necessary, concentrated before transformation.

For the transformation, 100 µl protoplast solution (10^6-10^7 protoplasts) is mixed with
10 µl DNA solution containing 5-10 µg DNA and incubated for 25 min at room
temperature. Then 60% PEG-4000 is carefully added in portions of 200 µl, 200 µl and
800 µl. The mixture is incubated for 20 min at room temperature. 3 ml STC is added to the
mixture and carefully mixed. The mixture is centrifuged at 3000 rpm for 10 min.

The supernatant is removed and the protoplasts are resuspended in the remainder of the
supernatant. 3-5 ml top agarose is added and the protoplasts are quickly spread on
selective plates.

AbfC promoter and heterologous gene expression

The expression vector pXP-Amy (Figure 13) contains the 2.1 kb α-amylase encoding
gene from Thermomyces lanuginosus cloned downstream of the AbfC promoter (2.1 kb)
and upstream of the Xylanase A terminator. This vector together with the hygromycin
gene as a selectable marker was used for co-transformation experiments to test the
functionality of the AbfC promoter.

The best transformant accumulated at least 1 gram per litre of α-amylase in the culture
medium in shake flask experiments. Starch degrading activity was detected
within 48 hours and a peak of enzyme activity was observed at 4 days of growth on sugar
beet pulp and wheat bran (Figure 15).

AbfC signal sequence functions in protein secretion

An expression construct containing the signal peptide of the AbfC gene translationally
fused to the mature α-amylase from T. lanuginosus was prepared and expression of this
construct in the production strains was observed. In this regard, the translational fusion
construct pXPXss-Amy (Figure 14) was placed under the transcriptional control of the
AbfC promoter and the xylanase A termination signal. The incorporation of an
endogenous signal peptide resulted in increased detectability of co-transformants
expressing both amylase and the hygromycin resistance marker. The endogenous signal
peptide directed the secretion of amylase out of the cell.

Substrate Specificity of AbfC Protein

The substrate specificity of the purified AbfC was determined using arabinose containing
hemicelluloses: arabinoxylans from wheat, oat and larch; branched and debranched
arabinans; arabinogalactan; sugar beet pectin; and xyloglucan.

The HPLC and HP-TLC results are shown in Figure 16, in which the following
abbreviations are used: WSP - water-soluble pentosan, WIP - water-insoluble pentosan,
AG - arabinogalactan, deB-A - debranched arabinan. The standards used were: A -
arabinose, X - xylose.

The results indicate that arabinose is the hydrolysis product from arabinoxylans. No
hydrolysis products were released from arabinogalactan, debranched arabinan or
xyloglucan. Arabinose was released as a hydrolysis product from branched arabinan.
AbfC is therefore a 1,2/1,2 debranching enzyme and it has no activity towards linear 1,5
α-linked L-arabinofuranose residues found in debranched arabinans and arabinogalactan.
This enzyme also releases a product when pectin is used as the substrate. It is believed
that this product is an arabinose containing ferulic acid or an arabinobiose.

The results for the substrate specificity studies also suggest that the enzyme of the present
invention could be used to reduce the viscosity of feeds. In this regard, the enzyme
would reduce the viscosity of branched substrates by removing the branches but not the
backbone of that substrate. This is in contrast to the known viscosity modifiers which
degrade the substrate backbone.

Accordingly, the present invention covers a process of reducing the viscosity of a
branched substrate wherein the enzyme degrades the branches of the substrate but not the
backbone of the substrate.

In particular, the present invention covers the use of the enzyme of the present invention
as a viscosity modifier.

In this regard, an experiment was carried out to investigate the reduction of viscosity of
the water-soluble pentosan fraction from wheat flour by arabinofuranosidase. In this
experiment, 6 ml water-soluble pentosan was incubated with 100 µl of AbfC for 20
hours at 20°C and pH 5.5.

The results (see Figure 19) show that the enzyme of the present invention can be used
to reduce the viscosity of pectins, especially pectins that are used in beverages - such as
fruit juices.

Accordingly, the present invention covers the use of the enzyme of the present invention
to reduce the viscosity of pectin.

ANTIBODY PRODUCTION 

Antibodies were raised against the enzyme of the present invention by injecting rabbits
with the purified enzyme and isolating the immunoglobulins from antiserum according
to procedures described by N Harboe and A Ingild ("Immunization, Isolation
of Immunoglobulins, Estimation of Antibody Titre" In A Manual of Quantitative
Immunoelectrophoresis, Methods and Applications, N H Axelsen, et al (eds.),
Universitetsforlaget, Oslo, 1973) and by T G Cooper ("The Tools of Biochemistry", John
Wiley & Sons, New York, 1977).

SUMMARY 

Even though it is known that Aspergillus niger produces arabinofuranosidases, the present
invention provides a novel and inventive arabinofuranosidase, as well as the coding
sequence therefor and the promoter for that sequence. An important advantage of the
present invention is that the enzyme can be produced in high amounts.

In addition, the promoter and the regulatory sequences (such as the signal sequence and
the terminator) can be used to express or can be used in the expression of GOIs in
organisms, such as in A. niger.

The arabinofuranosidase of the present invention is different from the
arabinofuranosidases previously known. In this regard, the previously described
arabinofuranosidases - such as those of EP-A-0506190 - are characterised by their ability
to degrade arabinan, and are assayed using p-nitrophenyl-arabinoside.

The arabinofuranosidase of the present invention does not degrade arabinan, and only a
minor activity is seen on p-nitrophenyl-arabinoside.

In contrast, the arabinofuranosidase of the present invention is useful for degrading
arabinoxylan. Therefore, the arabinofuranosidase of the present invention is quite
different from the previously isolated arabinofuranosidases.

More in particular, the enzyme of the present invention is capable of specifically cleaving
arabinose from the xylose backbone of arabinoxylan.

The enzyme of the present invention is useful as it can improve processes for preparing
foodstuffs and feeds as well as the foodstuffs and feeds themselves. For example, the
enzyme of the present invention may be added to animal feeds which are rich in
arabinoxylans. When added to feeds (including silage) for monogastric animals (e.g.
poultry or swine) which contain cereals such as barley, wheat, maize, rye or oats or
cereal by-products such as wheat bran or maize bran, the enzyme significantly improves
the break-down of plant cell walls which leads to better utilization of the plant nutrients
by the animal. As a consequence, growth rate and/or feed conversion are improved.
Moreover, arabinoxylan-degrading enzymes may be used to reduce the viscosity of feeds
containing arabinans. The arabinoxylan-degrading enzyme may be added beforehand to
the feed or silage if pre-soaking or wet diets are preferred.

Of particular benefit is the use of the enzyme according to the present invention in 
combination with a xylanase, especially an endoxylanase. 


A possible further application for the enzyme according to the present invention is in the
pulp and paper industry. The application of xylanases is often reported to be beneficial
in the removal of lignins and terpenoids from the cellulose and hemicellulose residues of
a hemicellulose backbone, an essential step in the processing of wood, wood pulp or
wood derivative product for the production of paper. The addition of arabinoxylan-
degrading enzymes, produced according to the present invention, to the xylanase
treatment step should assist in the degradation of an arabinan-containing hemicellulose
backbone and thus facilitate an improved, more efficient removal of both lignins and
terpenoids. The application of arabinoxylan-degrading enzymes should be particularly
advantageous in the processing of soft woods in which the hemicellulose backbone
contains glucuronic acid.

The enzyme according to the present invention is also useful as it acts in a synergistic 
manner with endoxylanase (see results presented above). 

Other modifications of the present invention will be apparent to those skilled in the art
without departing from the scope of the invention.

SEQUENCE LISTINGS

SEQ ID NO: 1
ENZYME SEQUENCE

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 296 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

Lys Cys Ser Leu Pro Ser
1               5

Ser Tyr Ser Trp Ser Ser Thr Asp Ala Leu Ala Thr Pro Lys Ser Gly
            10                  15                  20

Trp Thr Ala Leu Lys Asp Phe Thr Asp Val Val Ser Asp Gly Lys His
        25                  30                  35

Ile Val Tyr Ala Ser Thr Thr Asp Glu Ala Gly Asn Tyr Gly Ser Met
    40                  45                  50

Thr Phe Gly Ala Phe Ser Glu Trp Ser Asn Met Ala Ser Ala Ser Lys
55                  60                  65                  70

Thr Ala Thr Pro Tyr Asn Ala Val Ala Pro Thr Leu Phe Tyr Phe Lys
                75                  80                  85

Pro Lys Ser Ile Trp Val Leu Ala Tyr Gln Trp Gly Ser Ser Thr Phe
            90                  95                  100

Thr Tyr Arg Thr Ser Gln Asp Pro Thr Asn Val Asn Gly Trp Ser Ser
        105                 110                 115

Glu Lys Ala Leu Phe Thr Gly Lys Leu Ser Asp Ser Ser Thr Gly Ala
    120                 125                 130

Ile Asp Gln Thr Val Ile Gly Asp Asp Thr Asn Met Tyr Leu Phe Phe
135                 140                 145                 150

Ala Gly Asp Asn Gly Lys Ile Tyr Arg Ser Ser Met Ser Ile Asp Glu
                155                 160                 165

WO 96/29416 PCT/EP96/01009 

46 

Phe Pro Gly Ser Phe Gly Ser Gln Tyr Glu Glu Ile Leu Ser Gly Ala
            170                 175                 180

Thr Asn Asp Leu Phe Glu Ala Val Gln Val Tyr Thr Val Asp Gly Gly
        185                 190                 195

Glu Gly Asn Ser Lys Tyr Leu Met Ile Val Glu Ala Ile Gly Ser Thr
    200                 205                 210

Gly His Arg Tyr Phe Arg Ser Phe Thr Ala Ser Ser Leu Gly Gly Glu
215                 220                 225                 230

Trp Thr Ala Gln Ala Ala Ser Glu Asp Lys Pro Phe Ala Ala Lys Pro
                235                 240                 245

Thr Val Ala Pro Pro Gly Pro Lys Thr Leu Ala Met Val Thr Trp Phe
            250                 255                 260

Ala Thr Thr Leu Ile Lys Pro *
        265                 270

SEQ ID NO: 2

NUCLEOTIDE CODING SEQUENCE

AAA TGC TCT CTT CCA TCG TCC TAT AGT TGG AGT TCA ACC GAT GCT CTC
GCA ACT CCT AAG TCA GGA TGG ACC GCA CTG AAG GAC TTT ACT GAT GTT
GTC TCT GAC GGC AAA CAT ATC GTC TAT GCG TCC ACT ACT GAT GAA GCG
GGA AAC TAT GGC TCG ATG ACC TTT GGC GCT TTC TCA GAG TGG TCG AAC
ATG GCA TCT GCT AGC AAG ACA GCC ACC CCC TAC AAT GCC GTG GCT CCT
ACC CTG TTC TAC TTC AAG CCG AAA AGC ATC TGG GTT CTG GCC TAC CAA
TGG GGC TCC AGC ACA TTC ACC TAC CGC ACC TCC CAA GAT CCC ACC AAT
GTC AAC GGC TGG TCG TCG GAG AAG GCG CTT TTC ACC GGA AAA CTC AGC
GAC TCA AGC ACC GGT GCC ATT GAC CAG ACG GTG ATT GGC GAC GAT ACG
AAT ATG TAT CTC TTC TTT GCT GGC GAC AAC GGC AAG ATC TAC CGA TCC
AGC ATG TCC ATC GAT GAA TTT CCC GGA AGC TTC GGC AGC CAG TAC GAG
GAA ATT CTG AGT GGT GCC ACC AAC GAC CTA TTC GAG GCG GTC CAA GTG
TAC ACG GTT GAC GGC GGC GAG GGC AAC AGC AAG TAC CTC ATG ATC GTT
GAG GCG ATC GGG TCC ACT GGA CAT CGT TAT TTC CGC TCC TTC ACG GCC
AGC AGT CTC GGT GGA GAG TGG ACA GCC CAG GCG GCA AGT GAG GAT AAA
CCC TTC GCA GCA AAG CCA ACA GTG GCG CCA CCT GGA CCG AAG ACA TTA
GCC ATG GTG ACT TGG TTC GCA ACA ACC CTG ATC AAA CCA TGA

SEQ ID NO: 3
PROMOTER SEQUENCE

CTGCAGAAGA TGGCAGTCGC CACAGCCGAT CACCCGATCC ATACTGGATG TTGTAACTTG 60

GAGACAGCCT GCAGATGCTC TGATGAAGGT CTGCAAATAG TTCCTGGACC TCGATAGTGA 120

AGTATACCGA TTCGTCAATG TTGTATATCC AGCCACTTTG AAAGTACCAA CTTTTAGTTC 180

GATTGATCAG AATACTTTTG GTGTGTAACA TTGACAAGCC AAATTATCAA TCTCTTCTAC 240 

CGGTAAGGTG TCAACTACCC GGCCGAAAGT ACCGGAAGGT CGTGGTGTTT TAAGGTGAAA 300 

CAACTATCAG GGCGGCAATG TGTCAAAGTA GAACCAGTTT GCTTAGCGCC ATTAGGATCC 360 

ACGCCTAGAC CCTTGATGCC CGGGAGTTAT CCGTCCTGTC ACAGCAATTA TTTCCCCGAG 420 

TCTACTGCCG AAGAACAGCC ATTGTGGCGT ACTCACGGAA TTACCCACTG TGTAGGGTAG 480 

TCTTGAACGC CGTTCTAGAC ACGGCAACGC TCCGGTGGAC GATCGTTTCT GGCTAATGTA 540 

CTCCGTAGTT TAGGCAGCAT GCTGATCATC TTCCCCCTAG GGAAAGGCCC CTGAATAGTG 600 

CGCCAAAATG AGCTTGAGCA AAGGAATGTT CTTTCTAAGC CAAAGTGAGG GAAATAACCA 660 

AGCAGCCCAC TTTTATCCGA AACGTTTCTG GTGTCATCCA ATATGGATAA ATCCCGATTG 720 

TTCTTCTGCA CATATCTCTA TTGTCATAAG TGCAACTACA TATATTTGAA CATGGTTTGG 780 

TCCTCTTTCC AAGTTATTCG TTCTCCGTGA CCAGCGATTT CAGCCATTGA TTCTTTTGTT 840 
TCTTTCCCCG CGGATAAACT CATACGAAG

INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide
(v) FRAGMENT TYPE: N-terminal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

Lys Cys Ser Leu Pro Ser Ser Tyr Ser Trp Ser Ser Thr Asp Ala Leu
1               5                   10                  15

Ala Thr Pro Lys
            20

INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 41 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide

(v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

Tyr Leu Met Ile Val Glu Ala Ile Gly Ser Thr Gly His Arg Tyr Phe
1               5                   10                  15

Arg Ser Phe Thr Ala Ser Ser Leu Gly Gly Glu Met Thr Ala Gln Ala
            20                  25                  30

Ala Ser Glu Asp Lys Pro Phe Xaa Gly
        35                  40

INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 25 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide

(v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

Ser Ile Trp Val Leu Ala Tyr Gln Trp Gly Ser Ser Thr Phe Thr Tyr
1               5                   10                  15

Arg Thr Ser Gln Asp Pro Thr Asn Val
            20                  25

INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: peptide 

(v) FRAGMENT TYPE: internal 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 

Asp Ile Val Tyr Ala Ser Thr Thr Asp Glu Ala Gly Asn Tyr Gly Ser
1               5                   10                  15

Met Thr Phe Gly Ala Phe Ser Glu Xaa Ser Asn Met Ala Ser
            20                  25                  30

INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 41 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide

(v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

Ile Tyr Arg Ser Ser Met Ser Ile Asp Glu Phe Pro Gly Ser Phe Gly
1               5                   10                  15

Ser Gln Tyr Glu Glu Ile Leu Ser Gly Ala Thr Asn Asp Leu Phe Glu
            20                  25                  30

Ala Val Gln Val Tyr Thr Val Asp Gly
        35                  40



INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid

(A) DESCRIPTION: /desc = "OLIGONUCLEOTIDE"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:
GCYTGNGCNG TCATYTC 17



INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "OLIGONUCLEOTIDE" 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 
ATG ATH GTN GAR GCN ATH GG 20

INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 89 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid

(A) DESCRIPTION: /desc = "PCR fragment"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:
ATGATTGTGG AGGCGATCGG GTCCACTGGA CATCGTTATT TCCGCTCCTT CACGGCCAGC 60
AGTCTCGGTG GAGAGATGAC CGCACAGGC 89
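
The PCR fragment of SEQ ID NO: 11 should begin with a sequence matched by the degenerate primer of SEQ ID NO: 10 and end with a sequence matched by the reverse complement of the primer of SEQ ID NO: 9. The following sketch checks this using the standard IUPAC ambiguity codes; it is an illustration only, and the helper names are assumptions rather than anything taken from the listing.

```python
# Bases allowed by each IUPAC nucleotide code.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}
# Complement of each IUPAC code (e.g. R = A/G pairs with Y = T/C).
COMPLEMENT = {
    "A": "T", "C": "G", "G": "C", "T": "A",
    "R": "Y", "Y": "R", "S": "S", "W": "W", "K": "M", "M": "K",
    "B": "V", "V": "B", "D": "H", "H": "D", "N": "N",
}


def clean(seq: str) -> str:
    """Remove layout whitespace from a listed sequence and upper-case it."""
    return "".join(seq.split()).upper()


def reverse_complement(seq: str) -> str:
    """Reverse complement of a (possibly degenerate) sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(clean(seq)))


def matches(degenerate: str, target: str) -> bool:
    """True if every target base is allowed by the degenerate code at that position."""
    degenerate, target = clean(degenerate), clean(target)
    return len(degenerate) == len(target) and all(
        base in IUPAC[code] for code, base in zip(degenerate, target)
    )


PCR_FRAGMENT = clean("""
ATGATTGTGG AGGCGATCGG GTCCACTGGA CATCGTTATT TCCGCTCCTT CACGGCCAGC
AGTCTCGGTG GAGAGATGAC CGCACAGGC
""")                                            # SEQ ID NO: 11
FORWARD_PRIMER = "ATG ATH GTN GAR GCN ATH GG"   # SEQ ID NO: 10
REVERSE_PRIMER = "GCY TGN GCN GTC ATY TC"       # SEQ ID NO: 9

FORWARD_OK = matches(FORWARD_PRIMER, PCR_FRAGMENT[:20])
REVERSE_OK = matches(reverse_complement(REVERSE_PRIMER), PCR_FRAGMENT[-17:])

if __name__ == "__main__":
    print(len(PCR_FRAGMENT), FORWARD_OK, REVERSE_OK)
```

Because the ends of a PCR product derive from the primers themselves, agreement at the two ends confirms only that the listed fragment is consistent with the listed primers; the internal sequence is what carries new information about the gene.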

INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2555 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(vi) ORIGINAL SOURCE:

(A) ORGANISM: Aspergillus niger

(B) STRAIN: 3M43
(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 870..1757
(ix) FEATURE:

(A) NAME/KEY: sig_peptide

(B) LOCATION: 870..947
(ix) FEATURE:

(A) NAME/KEY: mat_peptide

(B) LOCATION: 948..1754

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 

CTGCAGAAGA TGGCAGTCGC CACAGCCGAT CACCCGATCC ATACTGGATG TTGTAACTTG 60 

GAGACAGCCT GCAGATGCTC TGATGAAGGT CTGCAAATAG TTCCTGGACC TCGATAGTGA 120 

AGTATACCGA TTCGTCAATG TTGTATATCC AGCCACTTTG AAAGTACCAA CTTTTAGTTC 180 

GATTGATCAG AATACTTTTG GTGTGTAACA TTGACAAGCC AAATTATCAA TCTCTTCTAC 240 
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CGGTAAGGTG TCAACTACCC GGCCGAAAGT ACCGGAAGGT CGTGGTGTTT TAAGGTGAAA 300 
CAACTATCAG GGCGGCAATG TGTCAAAGTA GAACCAGTTT GCTTAGCGCC ATTAG6ATCC 360 
AC6CCTAGAC CCTTGATGCC CGGGAGTTAT CCGTCCTGTC ACAGCAATTA TTTCCCCGAG 420 
TCTACTGCCG AAGAACAGCC ATTGTGGCGT ACTCACGGAA TTACCCACTG TGTAGGGTAG 480 
TCTTGAACGC CGTTCTAGAC ACGGCAACGC TCCGGTGGAC GATCGTTTCT GGCTAATGTA. 540 
CTCCGTAGTT TAGGCAGCA F GCTGATCATC TTCCCCCTAG GGAAAGGCCC CTGAATAGTG 600 
CGCCAAAATG AGCTTGAGCA AAGGAATGTT CTTTCTAAGC CAAAGTGAGG GAAATAACCA 660 
AGCAGCCCAC TTTTATCCGA AACGTTTCTG GTGTCATCCA ATATGGATAA ATCCCGATTG 720 
TTCTTCTGCA CATATCTCTA TTGTCATAAG TGCAACTACA TATATTTGAA CATGGTTTGG 780 
TCCTCTTTCC AAGTTATTCG TTCTCCGTGA CCAGCGATTT CAGCCATTGA TTCTTTTGTT 840 
TCTTTCCCCG CGGATAAACT CATACGAAG ATG AAG TTC TTC AAT GCC AAA GGC 893 

Met Lys Phe Phe Asn Ala Lys Gly 
-26 -25 -20 
AGC TTG CTG TCA TCA GGA ATC TAC CTC An GCA TTA ACC CCC TTT GTT 941 
Ser Leu Leu Ser Ser Gly lie Tyr Leu lie Ala Leu Thr Pro Phe Val 

-15 -10 -5 

AAC GCC AAA TGC TCT CTT CCA TCG TCC TAT AGT TGG AGT TCA ACC GAT 989 
Asn Ala Lys Cys Ser Leu Pro Ser Ser Tyr Ser Trp Ser Ser Thr Asp 

1 5 10 

GCT CTC GCA ACT CCT AAG TCA GGA TGG ACC GCA CTG AAG GAC TTT ACT 1037 
Ala Leu Ala Thr Pro Lys Ser Gly Trp Thr Ala Leu Lys Asp Phe Thr 
15 20 25 30 

GAT GTT GTC TCT GAC GGC AAA CAT ATC GTC TAT GCG TCC ACT ACT GAT 1085 
Asp Val Val Ser Asp Gly Lys His Ile Val Tyr Ala Ser Thr Thr Asp 

35 40 45 

GAA GCG GGA AAC TAT GGC TCG ATG ACC TTT GGC GCT TTC TCA GAG TGG 1133 
Glu Ala Gly Asn Tyr Gly Ser Met Thr Phe Gly Ala Phe Ser Glu Trp 

50 55 60 

TCG AAC ATG GCA TCT GCT AGC AAG ACA GCC ACC CCC TAC AAT GCC GTG 1181 
Ser Asn Met Ala Ser Ala Ser Lys Thr Ala Thr Pro Tyr Asn Ala Val 

65 70 75 

GCT CCT ACC CTG TTC TAC TTC AAG CCG AAA AGC ATC TGG GTT CTG GCC 1229 
Ala Pro Thr Leu Phe Tyr Phe Lys Pro Lys Ser Ile Trp Val Leu Ala 

80 85 90 

TAC CAA TGG GGC TCC AGC ACA TTC ACC TAC CGC ACC TCC CAA GAT CCC 1277 
Tyr Gln Trp Gly Ser Ser Thr Phe Thr Tyr Arg Thr Ser Gln Asp Pro 
95 100 105 110 
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ACC AAT GTC AAC GGC TGG TCG TCG GAG AAG GCG CTT TTC ACC GGA AAA 1325 
Thr Asn Val Asn Gly Trp Ser Ser Glu Lys Ala Leu Phe Thr Gly Lys 

115 120 125 

CTC AGC GAC TCA AGC ACC GGT GCC ATT GAC CAG ACS GTG ATT GGC GAC 1373 
Leu Ser Asp Ser Ser Thr Gly Ala Ile Asp Gln Thr Val Ile Gly Asp 

130 135 140 

GAT ACG AAT ATG TAT CTC TTC TTT GCT GGC GAC AAC GGC AAG ATC TAC 1421 
Asp Thr Asn Met Tyr Leu Phe Phe Ala Gly Asp Asn Gly Lys Ile Tyr 

145 150 155 

CGA TCC AGC ATG TCC ATC GAT GAA TTT CCC GGA AGC TTC GGC AGC CAG 1469 
Arg Ser Ser Met Ser Ile Asp Glu Phe Pro Gly Ser Phe Gly Ser Gln 

160 165 170 

TAC GAG GAA ATT CTG AGT GGT GCC ACC AAC GAC CTA TTC GAG GCG GTC 1517 
Tyr Glu Glu Ile Leu Ser Gly Ala Thr Asn Asp Leu Phe Glu Ala Val 
175 180 185 190 

CAA GTG TAC ACG GTT GAC GGC GGC GAG GGC AAC AGC AAG TAC CTC ATG 1565 
Gln Val Tyr Thr Val Asp Gly Gly Glu Gly Asn Ser Lys Tyr Leu Met 

195 200 205 

ATC GTT GAG GCG ATC GGG TCC ACT GGA CAT CGT TAT TTC CGC TCC TTC 1613 
Ile Val Glu Ala Ile Gly Ser Thr Gly His Arg Tyr Phe Arg Ser Phe 

210 215 220 

ACG GCC AGC AGT CTC GGT GGA GAG TGG ACA GCC CAG GCG GCA AGT GAG 1661 
Thr Ala Ser Ser Leu Gly Gly Glu Trp Thr Ala Gln Ala Ala Ser Glu 

225 230 235 

GAT AAA CCC TTC GCA GCA AAG CCA ACA GTG GCG CCA CCT GGA CCG AAG 1709 
Asp Lys Pro Phe Ala Ala Lys Pro Thr Val Ala Pro Pro Gly Pro Lys 

240 245 250 

ACA TTA GCC ATG GTG ACT TGG TTC GCA ACA ACC CTG ATC AAA CCA TGA 1757 
Thr Leu Ala Met Val Thr Trp Phe Ala Thr Thr Leu Ile Lys Pro * 
255 260 265 270 

CTGTCGATCC TTGCAACCTC CAGTTGCTCT ATCAGGGCCA TGACCCCCAA CAGCAGTGGC 1817 
GACTACAACC TCTTGCCATG GAAGCCGGGC GTCCTTACCT TGAAGCAGTG ACGAGCTTAT 1877 
CTTTAGTTGC AGATCGTGTT TCTCCTTTCT TCTTCAAGTA GTTTTAGTGG TGGAAGACAG 1937 
CAGAAGGTGG TCATCATCTT AGGCTCAGTT GGGGTGGGCT CCTGCCACGT TTTGTCCATA 1997 
GGCTAGTAAT TTGCACGGAA TTCAGTTCAT TGGCAAGGAG TGCGGTACGA ATACCTGTTT 2057 
TCACAATAGC AATTAGGCCC AGTAGTTATA CTACGTACTG GAATTGAGTA CTCGTAGTAG 2117 
CAAGATTGTT TGCCTCAGAG GGAATGGCCG ACACGTGAGC AAGTCACCTT CATCAGCTAG 2177 
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TCGCGTTCCA CATAGACAAT GGTCCAGCTC CAGAGTGGAA TTTGGGCTAC TTTGAACGAT 2237 

GGCCGATTGA ATCGCGCGTC TCCTCAATTG TATTTAACCA CAATAGGCCA GGTATTGGCA 2297 

TTCACTCTCC GCCTTTGCGG GTGCCGGCAC GAGATGTCTC CTGAAGAAAC TAGGCAACGA 2357 

GCAGACTGTG GATATGGGAG ATGGTTGACG ATGTGCTTCT TGGTAAATTT GAAGCCTCCA 2417 

GGGCCTCTAG AAAGGCGGGA ATTTAAATCT CAAGTGCCCT AACGTGTCCG ACCACGGTGT 2477 

TGATCATCAT TCATTGAATC GGATAACAGT CTTGGTTCGG AAACTGAACA GGCGGCTCTT 2537 

GAATGACACT CTGGATCC 2555 
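
The feature coordinates given for SEQ ID NO: 12 can be cross-checked with simple arithmetic. The following Python sketch is illustrative only and forms no part of the sequence listing. The names `FEATURES`, `span_length` and `summarize` are hypothetical, and the sketch assumes that coordinates are 1-based and inclusive and that the CDS span includes the stop codon.

```python
# Feature coordinates as stated in the SEQ ID NO: 12 listing.
# Assumption: 1-based, inclusive positions; the CDS includes the stop codon.
FEATURES = {
    "CDS": (870, 1757),
    "sig_peptide": (870, 947),
    "mat_peptide": (948, 1754),
}
TOTAL_LENGTH = 2555  # stated length of SEQ ID NO: 12 in base pairs


def span_length(start, end):
    """Length in nucleotides of a 1-based inclusive span."""
    return end - start + 1


def summarize(features):
    """Return nucleotide length, codon count and frame check per feature."""
    summary = {}
    for name, (start, end) in features.items():
        n = span_length(start, end)
        summary[name] = {"nt": n, "codons": n // 3, "in_frame": n % 3 == 0}
    return summary


summary = summarize(FEATURES)
# Nucleotides downstream of the CDS (the 3' flanking region).
downstream_nt = TOTAL_LENGTH - FEATURES["CDS"][1]

for name, info in summary.items():
    print(name, info)
print("downstream of CDS:", downstream_nt, "nt")
```

Under these assumptions the CDS spans 888 nucleotides (296 codons including the stop codon), the signal peptide 78 nucleotides (26 codons) and the mature peptide 807 nucleotides (269 codons). The 798 nucleotides downstream of the CDS equal the stated length of the terminator sequence, SEQ ID NO: 13.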



(2) INFORMATION FOR SEQ ID NO: 13: 
TERMINATOR SEQUENCE 



CTGTCGATCC TTGCAACCTC CAGTTGCTCT ATCAGGGCCA TGACCCCCAA CAGCAGTGGC 60 
GACTACAACC TCTTGCCATG GAAGCCGGGC GTCCTTACCT TGAAGCAGTG ACGAGCTTAT 120 
CTTTAGTTGC AGATCGTGTT TCTCCTTTCT TCTTCAAGTA GTTTTAGTGG TGGAAGACAG 180 
CAGAAGGTGG TCATCATCTT AGGCTCAGTT GGGGTGGGCT CCTGCCACGT TTTGTCCATA 240 
GGCTAGTAAT TTGCACGGAA TTCAGTTCAT TGGCAAGGAG TGCGGTACGA ATACCTGTTT 300 
TCACAATAGC AATTAGGCCC AGTAGTTATA CTACGTACTG GAATTGAGTA CTCGTAGTAG 360 
CAAGATTGTT TGCCTCAGAG GGAATGGCCG ACACGTGAGC AAGTCACCTT CATCAGCTAG 420 
TCGCGTTCCA CATAGACAAT GGTCCAGCTC CAGAGTGGAA TTTGGGCTAC TTTGAACGAT 480 
GGCCGATTGA ATCGCGCGTC TCCTCAATTG TATTTAACCA CAATAGGCCA GGTATTGGCA 540 
TTCACTCTCC GCCTTTGCGG GTGCCGGCAC GAGATGTCTC CTGAAGAAAC TAGGCAACGA 600 
GCAGACTGTG GATATGGGAG ATGGTTGACG ATGTGCTTCT TGGTAAATTT GAAGCCTCCA 660 
GGGCCTCTAG AAAGGCGGGA ATTTAAATCT CAAGTGCCCT AACGTGTCCG ACCACGGTGT 720 
TGATCATCAT TCATTGAATC GGATAACAGT CTTGGTTCGG AAACTGAACA GGCGGCTCTT 780 
GAATGACACT CTGGATCC 798 



(2) INFORMATION FOR SEQ ID NO: 14: 
SIGNAL SEQUENCE 

ATG AAG TTC TTC AAT GCC AAA GGC AGC TTG CTG TCA TCA GGA ATC TAC 48 
CTC ATT GCA TTA ACC CCC TTT GTT AAC GCC 78 
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SEQ ID NO: 15 

SIGNAL SEQUENCE 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 26 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 



Met Lys Phe Phe Asn Ala Lys Gly Ser Leu 10 
Leu Ser Ser Gly Ile Tyr Leu Ile Ala Leu 20 
Thr Pro Phe Val Asn Ala 26 
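
The correspondence between the nucleotide signal sequence (SEQ ID NO: 14) and the amino acid signal sequence (SEQ ID NO: 15) can be checked by translating the 78 nucleotides with the standard genetic code. The Python sketch below is illustrative only; `CODON_TABLE`, `THREE_LETTER` and `translate` are hypothetical names, and the use of the standard (nuclear) genetic code is an assumption.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order (an assumption
# that the standard nuclear code applies to this Aspergillus niger gene).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

THREE_LETTER = {
    "A": "Ala", "C": "Cys", "D": "Asp", "E": "Glu", "F": "Phe",
    "G": "Gly", "H": "His", "I": "Ile", "K": "Lys", "L": "Leu",
    "M": "Met", "N": "Asn", "P": "Pro", "Q": "Gln", "R": "Arg",
    "S": "Ser", "T": "Thr", "V": "Val", "W": "Trp", "Y": "Tyr",
    "*": "*",
}

# SEQ ID NO: 14 as printed in the listing (whitespace is ignored).
SEQ_ID_14 = (
    "ATG AAG TTC TTC AAT GCC AAA GGC AGC TTG CTG TCA TCA GGA ATC TAC "
    "CTC ATT GCA TTA ACC CCC TTT GTT AAC GCC"
)


def translate(dna):
    """Translate an in-frame DNA string into one-letter amino acid codes."""
    dna = "".join(dna.split()).upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


peptide = translate(SEQ_ID_14)
three_letter = " ".join(THREE_LETTER[aa] for aa in peptide)
print(len(peptide), peptide)
print(three_letter)
```

The translation yields 26 residues, beginning Met Lys Phe Phe and ending Val Asn Ala, in agreement with SEQ ID NO: 15.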
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1 


The National Collections of Industrial and Marine Bacteria Limited (NCIMB) 
23 St. Machar Drive 
United Kingdom 
Accession Number 
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In respect of those designations in which a European patent is sought, and any 
other designated state having equivalent legislation, a sample of the deposited 
microorganism will be made available until the publication of the mention of the 
grant of the European patent or until the date on which the application has been 
refused or withdrawn or is deemed to be withdrawn, only by the issue of such a 
sample to an expert nominated by the person requesting the sample (Rule 28(4) 
EPC). 
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1 An enzyme that is obtainable from Aspergillus, wherein the enzyme has the 
following characteristics: 

a. a MW of 33,270 D ± 50 D 

b. a pI value of about 3.7 

c. arabinoxylan degrading activity 

d. a pH optima of from about 2.5 to about 7.0 (more especially from about 
3.3 to about 4.6, more especially about 4) 

e. a temperature optima of from about 40°C to about 60°C (more especially 
from about 45°C to about 55°C, more especially about 50°C); 

wherein the enzyme is capable of cleaving arabinose from the xylose backbone of an 
arabinoxylan. 

2. An enzyme having the sequence shown as SEQ. I.D. No. 1 or a variant, 
homologue or fragment thereof. 

3. An enzyme coded by the nucleotide sequence shown as SEQ. I.D. No. 2 or a 
variant, homologue or fragment thereof or a sequence complementary thereto. 

4. A nucleotide sequence coding for the enzyme according to claim 1. 

5. A nucleotide sequence coding for the enzyme according to claim 2. 

6. A nucleotide sequence having the sequence shown as SEQ. I.D. No. 2 or a 
variant, homologue or fragment thereof or a sequence complementary thereto. 

7. A nucleotide sequence according to any one of claims 4 to 6 operatively linked 
to a promoter. 



8. A nucleotide sequence according to claim 7 wherein the promoter is the promoter 
having the sequence shown as SEQ. I.D. No. 3 or a variant, homologue or fragment 
thereof or a sequence complementary thereto. 

9. A promoter having the sequence shown as SEQ. I.D. No. 3 or a variant, 
homologue or fragment thereof or a sequence complementary thereto. 

10. A promoter according to claim 9 operatively linked to a GOI. 

11. A promoter according to claim 10 wherein the promoter is operatively linked to 
a GOI, wherein the GOI comprises a nucleotide sequence according to any one of claims 
4-6. 

12. A terminator having the nucleotide sequence shown as SEQ. I.D. No. 13 or a 
variant, homologue or fragment thereof or a sequence complementary thereto. 

13. A signal sequence having the nucleotide sequence shown as SEQ. I.D. No. 14 or 
a variant, homologue or fragment thereof or a sequence complementary thereto. 

14. A construct comprising or expressing the invention according to any one of claims 
1 to 13. 

15. A vector comprising or expressing the invention of any one of claims 1 to 14. 

16. A plasmid comprising or expressing the invention of any one of claims 1 to 15. 

17. A transgenic organism comprising or expressing the invention according to any 
one of claims 1 to 16. 






18. A transgenic organism according to claim 17 wherein the organism is a fungus. 
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19. A transgenic organism according to claim 18 wherein the organism is a 
filamentous fungus, preferably of the genus Aspergillus. 

20. A transgenic organism according to claim 17 wherein the organism is a plant. 


21. A process of preparing an enzyme according to any one of claims 1 to 3 
comprising expressing a nucleotide sequence according to any one of claims 4-8. 

22. A process according to claim 21 wherein the enzyme has the sequence shown as 
SEQ. I.D. No. 1 or a variant, homologue or fragment thereof, and the nucleotide 
sequence has the sequence shown as SEQ. I.D. No. 2 or a variant, homologue or 
fragment thereof or a sequence complementary thereto. 

23. A process according to claim 21 or claim 22 wherein the expression is controlled 
(partially or completely) by use of a promoter according to claim 9. 

24. A process for expressing a GOI by use of a promoter, wherein the promoter is 
the promoter according to claim 9. 

25. Use of an enzyme according to any one of claims 1 to 3 or prepared by a process 
according to any one of claims 21 to 24 to degrade an arabinoxylan. 

26. Use according to claim 24 wherein the enzyme is used in combination with a 
xylanase, preferably an endoxylanase. 


27. A combination of enzymes to degrade an arabinoxylan, the combination 
comprising an enzyme according to any one of claims 1 to 3 or prepared by a process 
according to any one of claims 21 to 24; and a xylanase. 

28. Plasmid NCIMB 40703, or a nucleotide sequence obtainable therefrom for 
expressing an enzyme capable of degrading arabinoxylan or for controlling the expression 
thereof or for controlling the expression of another GOI. 
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29. A signal sequence having the sequence shown as SEQ. I.D. No. 15 or a variant, 
homologue or fragment thereof. 

30. The use of the enzyme according to any one of claims 1 to 3 or prepared by a 
process according to any one of claims 21 to 24, in the manufacture of a 
medicament or foodstuff to reduce or prevent indigestion and/or increase nutrient 
absorption. 






31. An arabinofuranosidase enzyme having arabinoxylan degrading activity, which is 
immunologically reactive with an antibody raised against a purified arabinofuranosidase 
enzyme having the sequence shown as SEQ. I.D. No. 1. 
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FIGURE 1 



AMY 637 PROMOTER 

SEQUENCE TYPE: Nucleotide 

MOLECULE TYPE: DNA 

ORIGINAL SOURCE: Solanum tuberosum 

SEQUENCE LENGTH: 2094 

SEQUENCE: 

10 20 30 40 

ATTAAGGGGA GCATAAGTGC AGCTCAGAAA TTCACACCTG 
50 60 70 80 

ATATTTTCCC AAAGCCCTCA AAAATGTGAA CAAATCTGCT 
90 100 110 120 

AAAATGTCAG TCAGAAGGAC TGTTCTTTTA GGTTTTCTTC 

130 140 150 160 

TCTCGAGTCA CGAAATCAGA TAATATGATA AGAAATTATG 

170 180 190 200 

GAGGATTTAT AATGTATCTG TCTGTTCTTA GGTATAATTA 

210 220 230 240 

TGTGTTCCTT TATGATGTAG TAATGGAATT CTGGGCTTAT 

250 260 270 280 

ATTAAAGGAA CTGAATATAA ATGTTCGCAT TTTAACTGCG 

290 300 310 320 

GAGACTTCGA GTTAGAGCCT TATAATTATG TCTTATCATT 

330 340 350 360 

TTATACTGAG ATCATATTAC AGATGATGAA AGCTGACATT 

370 380 390 400 

GCATTAGTTA TTCTGTTTTA TACAAGTCAT GTAACTGCTG 

410 420 430 440 

CTTGTGAGTT GTGACTGTAA GATAAATTGA TTCAGCCTTC 

450 460 470 480 

TGTGGCATTA GCGGAGATCT GATTATACTC TCATCGTCTT 

490 500 510 520 

ATCTAAGTTG CTCATGCAAC TTTGTCCTTG ATAGTTGGCT 

530 540 550 560 

AATACTACAA CTGGAATTAA GTGTAGTTAT TCGAAATCTC 

570 580 590 600 

TGTTGGAAGT TGCTAAGTGC TTAAGTGCTG GTTATTGTAA 

610 620 630 640 

ACCCCATCCG AGTTATTATA CAGCATCTGG CTGATGAAAT 

650 660 670 680 

GCTGCTCATT TGCAATGGTG ACATAACCAA ATGTTAGTAA 

690 700 710 720 

AACATACTAG CTGGTTGAAT GTTAGATGAT TGTTCAACGT 

730 740 750 760 

TACATCTCAC AGAAACCTTA TTATGGATTG ACATGTTAGT 
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770 780 790 800 

TGATCCGAM GATCCTTCTT TTAAATGCCA AAGCTTGTTA 

810 820 830 840 

CAGATTTGAG GAGTTCTTTT ACTTTCTTTT GTTATATCTA 

850 860 870 880 

TTTCCCATTC ATTTTGACGT TCAGCCTCAC AGATGTTGTC 

890 900 910 920 

ATACTTAGAA ATGTGCGTAT ATATATAGAG AGAGAGAGAT 

930 940 950 960 

AGAGTGAAAT GATTATATAG TCGAAGATTA CGAAACTTGA 

970 980 990 1000 

CATTGAGACA TCTGTGATTG TTTGAAATTT ATGTATATAT 

1010 1020 1030 1040 

CTGTAGCATT AGAAACTATA AGAGTTGTTA GCTTCACTTG 

1050 1060 1070 1080 

TCTTATTGTT GTGCTCAAAG CAACTTCATC ATACAGTATG 

1090 1100 1110 1120 

GTTTTTATAT GCTCTTCCAT TATCACCGAA CCTTATGATT 

1130 1140 1150 1160 

ATGTGTACGA GCTTATAATA TTACTGATGG TGATTCAGTA 

1170 1180 1190 1200 

TTATGATTAT GTCCTCCATT AATTATTCTG TTTCATACAA 

1210 1220 1230 1240 

GTCGTGTAAT TTGCTGTTTG TGATTGTACG ATAAATTGAT 

1250 1260 1270 1280 

TCAACCTTCT GCGGTGTTGG TTGAAGTTCA AGTAAATTAG 

1290 1300 1310 1320 

CTTTATTTAT CATAGTAGCA TTTGATTATT GATGCTCTGT 

1330 1340 1350 1360 

AGCTAATGAT AAGCCATTGA AGGGAAGCAG AAATGGTAAA 

1370 1380 1390 1400 

GCTTTCTAAA ATGAATCTAC GAATGGATGA TAAAGTTAAT 

1410 1420 1430 1440 

GAATATTGTT GATACTTCTG CAATCAGATT ATGAGTTACT 

1450 1460 1470 1480 

GAGTCTACTG TTTTTTAAGC CTGTTTCAGA TGATCGATCA 

1490 1500 1510 1520 

TCAACAACAA CATATTCAGT GTAGTAGACA TGATCGATCA 

1530 1540 1550 1560 

CTTTCTAATT TTCGATTATG CACCCTCTTT TCTCCAATTT 

1570 1580 1590 1600 

GGTCGTCTTC TTTTTTTCAT GATGTCACTG AATTATTCTC 

1610 1620 1630 1640 

TGGTCGTCCC CACCATTCAG GAAGTCACTT CGAGCATAAT 

1650 1660 1670 1680 

GTGAAAACAT CCACATTTTT CAAATCCAGC AGAATTTTCA 
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1690 1700 1710 1720 

TCAAACGGGG TTCAACATTT ACTACATGTA TACACTCTGA 

1730 1740 1750 1760 

AGTCTGAATC CACTAATTCT AGATGGTGCA TCTGTGCCCC 

1770 1780 1790 1800 

CACACTTGTG AAAGCTTATT CTCAATTTTT TATTTTCCAA 

1810 1820 1830 1840 

CAACTTGAAT TCAGACCACA CAACTCCCGT GTCTTGTACG 

1850 1860 1870 1880 

GTCAGCATCT GAGTGGAGAA CTCAATTAAG TGACTTTAAC 

1890 1900 1910 1920 

GTCGAGTTCT ATAGTAAACA ACCCCTATAT CTTTTTTCAA 

1930 1940 1950 1960 

GCATGTTAAG ATTGCGAACA CACTGAAATT TCCAGGTCGT 

1970 1980 1990 2000 

TAATCTTGTA CCCAGTGTGT GTACTTTTAA AAAAAAAAGT 

2010 2020 2030 2040 

CAGTTTTTTA GTCTCTAAAA CACATTTAAA TAGAGTTTAT 

2050 2060 2070 2080 

TTGCCATCTT TTGTTCCTCA TACTAGACTT CGGAGTCAAC 

2090 
ACAACACAAC AACA 
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FIGURE 2 

AMY 351 PROMOTER 
SEQUENCE TYPE: Nucleotide 
MOLECULE TYPE: DNA (genomic) 
ORIGINAL SOURCE: Solanum tuberosum 
SEQUENCE LENGTH: 1734 bp 
STRANDEDNESS: Double 
TOPOLOGY: Linear 
SEQUENCE: 

10 20 30 40 

TCTTTAAGTT GTTTGCTTGA TTTTTCTTCT TCAATCTTCT 
50 60 70 80 

ATATTTAATT CGTTTTAGCT TCAAACTTCT TCAATTTTAT 
90 100 110 120 

TTCAATTTAA TTCTACAAAA AAAATCTCTA TTTAGCACCA 

130 140 150 160 

TTCATAAAAT TCATGCTCAA AATGGGCAAA CATAAATAAT 

170 180 190 200 

AAATGTGAAG TAAATAATGG ATTAAAATAT ATATTTTTGG 

210 220 230 240 

GCCTCACATC AACCTTCATA ATTCTTGAAT GAATGAATGA 

250 260 270 280 

TAGACTTCAT AATTTTTTAA CCTATACATA TAAGAAAATT 

290 300 310 320 

GAGAGTAACT CAAATAACAA GTTGTAGTAT CACATCTTTA 

330 340 350 360 

CTATTTGATA ACATTATGAA GGTGATTATA CATTACGTAA 

370 380 390 400 

CATTTCTTTT AAAAATATGT AAGCAAATTT ACTTTTTAAC 

410 420 430 440 

TTATCATTGA TCTTCATGGT TTTGTCATAA ATCTCAAAGT 

450 460 470 480 

TATCATATTT TATATAGCTA TTTGAAAGTA ATTTTATTTT 

490 500 510 520 

TACTCATCAT TGAGTGATGC TTTTATTATA ATACTAGTAA 

530 540 550 560 

GTTTTATTTA TTATTTTCTT TTAGGGGTGA ATTGTATAAT 

570 580 590 600 

ATAATAAAAA ATATATTTTT AGAAATAATG ATTCTTTTAT 

610 620 630 640 

TATTAAAAAG TTAAGATATT AGATTATTTA TGCTTGTATA 

650 660 670 680 

ATAATGAACG AAGTTTTATT TTCTATGAGT TTCATTAATC 

690 700 710 720 

ATGTTTGTAA TTATTTCAAA TTTTGATGTA TTTTTATAAT 

730 740 750 760 

TTTGTATTAT TATATTATTA TACTATATTT AAAAATTTAA 

770 780 790 800 

AGATCCATAG GGCTTACGCC CCACGTCAAG AGGCTTGCGC 

810 820 830 840 

CTTTCCCTAA ATTAAGTAAA ACTCTTCGCC TCATGCCTTA 

850 860 870 880 

CGCCTCCGCC TTTTAAAACA CTGATTCCTT TCCTCATATA 

890 900 910 920 

GCTTGAGGCG AAAATATTTA ATAAAAACAC TTCTTAATTT 

930 940 950 960 

GTTTATATGT TCAATTGAAC ATGTCCGTGA TTAGAAAATT 

970 980 990 1000 

AAATTAAATT CAATGACAAA TTTAATAATT TGACACAAAA 

1010 1020 1030 1040 

TTTATGAAAA AAATATCAAA ATATAAAGAA ATATTTTTTT 

1050 1060 1070 1080 

TGAAATGGAT TAAAAAGAAA AAAAAAACAA ATAAATTGAA 

1090 1100 1110 1120 

CCGGGATAAG TTGGTTGTTT AATTGATTAT TGATTATGAT 

1130 1140 1150 1160 

CTCAATTTGA CATTTTGCGC GATCTTTCGA CCTCAATTCG 

1170 1180 1190 1200 

TATGAACTGA CACTACGCCA ATGGACAGTC GCCGTCGTCA 

1210 1220 1230 1240 

CCGCCACCGC ACTATTCTCG ACGCGTCGTC TATCTCCTCC 

1250 1260 1270 1280 

ACCCCACAGC CGTCAATTCC AAGCTTCCAA TGAACCGTTG 

1290 1300 1310 1320 

CCATGTGTCA CTGCCTATTC ACCGCGAAAC ATGAATATCA 

1330 1340 1350 1360 

CTGACGAACG ATTTCGGAGC GGAACGAATC CAGAAAATGG 

1370 1380 1390 1400 

ATTACTTTCT ATAAATTCCT CGAATCTCAA CTCCATTTCG 

1410 1420 1430 1440 

TAAAAATAAA ATTAAAAATA TTGTTTCTTT TTGTATTTCT 

1450 1460 1470 1480 

TTTTGTATTT CTGGTTTATG TGGTGATCGA ATTTTCAATT 

1490 1500 1510 1520 

TTTTTACTGG TAGTGATTCC TACTTTTCTT CAATTGCATT 

1530 1540 1550 1560 

TCTCCTTTTT CCATTTCACG GTTGAGAATT CATGATTCCT 

1570 1580 1590 1600 

TATCAGAGGA ATCGATCCGA TTTGACTAAT TTCACTTTTC 

1610 1620 1630 1640 

GTCTGTATAA ATACCAGAGT ATCTAGGTTG AGGAACGTAA 

1650 1660 1670 1680 

TTTCAAGCTG CGATCGGCTT TTTCCCCTGA ACGAGCAAAC 

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1690 1700 1710 1720 

ACAGGTTGTG GGTTCGAGTT AGCAAGGGAC GTATAATCTC 
1730 

AACTACAATC CATT 
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FIGURE 3 

α-AMYLASE CODING SEQUENCE 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2017 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(A) LENGTH: 475 amino acids 
(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

ATG AAG TCT CTC GCC GCA ATT GCT GCT CTG CTG TCG CCC ACA CTG GTC 48 
Met Lys Ser Leu Ala Ala Ile Ala Ala Leu Leu Ser Pro Thr Leu Val 
-18 -15 -10 -5 

CGG GCA GCG ACT CCG GAT GAG TGG AAA GCT CAG TCG ATC TAT TTC ATG 96 
Arg Ala Ala Thr Pro Asp Glu Trp Lys Ala Gln Ser Ile Tyr Phe Met 
1 5 10 

CTG ACG GAC CGG TTT GCG CGT ACC GAC AAT TCG ACC ACG GCT CCC TGT 144 
Leu Thr Asp Arg Phe Ala Arg Thr Asp Asn Ser Thr Thr Ala Pro Cys 
15 20 25 30 

GAC ACC ACT GCC GGG GTATGCAACT AACCCTGTGT TTCTCTTCCC GGGACGTACA 199 
Asp Thr Thr Ala Gly 
35 

AGGGGTCTTC TCCATGCTAA CCGTGCACAT GCAG AAA TAT TGC GGG GGA ACA 251 

Lys Tyr Cys Gly Gly Thr 
40 

TGG CGA GGT ATC ATC AAC AAC GTAAGTGGCT TCTGATTTTC GCTCAATAAT 302 
Trp Arg Gly Ile Ile Asn Asn 
45 

CTTCGTCGCG TGACTTTATT TCCTAG CTG GAT TAC ATC CAG GAT ATG GGC TTC 355 

Leu Asp Tyr Ile Gln Asp Met Gly Phe 
50 55 

ACA GCT ATC TGG ATA ACT CCA GTG ACA GCC CAG TGG GAC GAC GAT GTG 403 
Thr Ala Ile Trp Ile Thr Pro Val Thr Ala Gln Trp Asp Asp Asp Val 
60 65 70 

GAT GCG GCA GAT GCA ACG TCG TAT CAC GGT TAT TGG CAG AAA GAC CT 450 
Asp Ala Ala Asp Ala Thr Ser Tyr His Gly Tyr Trp Gln Lys Asp Leu 
75 80 85 

GTGCGCAACC CTGCTCCATG GATCGCTGGC TGCAAACTCG TGCTGATCGG TGATTTTTTT 510 

TTTTTTTTTT TTGAAACAG A TAC TCT CTG AAT TCG AAA TTC GGC ACT GCC 560 

Tyr Ser Leu Asn Ser Lys Phe Gly Thr Ala 
90 95 
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GAT GAC TTG AAA GCC CTG GCT GAT GCC CTT CAC GCC CGT GGG ATG CTT 608 

Asp Asp Leu Lys Ala Leu Ala Asp Ala Leu His Ala Arg Gly Met Leu 
100 105 110 115 

CTC ATG GTC GAC GTC GTG GCT AAT CAC TTT GTACGGACCA TCTACATACC 658 
Leu Met Val Asp Val Val Ala Asn His Phe 
120 125 

TGGGAAACGC GAAGAAGGAA AAAAAAAAAA AGGCGCACGC TAACATTTCG CGTTTAG 715 

GGC TAC GGC GGT TCT CAT AGC GAG GTG GAT TAC TCG ATC TTC AAT CCT 763 
Gly Tyr Gly Gly Ser His Ser Glu Val Asp Tyr Ser Ile Phe Asn Pro 
130 135 140 

CTG AAC AGC CAG GAT TAC TTC CAC CCG TTC TGT CTC ATT GAG GAC TAC 811 
Leu Asn Ser Gln Asp Tyr Phe His Pro Phe Cys Leu Ile Glu Asp Tyr 
145 150 155 

GAC AAC CAG GAA GAA GTC GAA CAA TGC TGG CTG GCC GAT ACT CCG ACG 859 
Asp Asn Gln Glu Glu Val Glu Gln Cys Trp Leu Ala Asp Thr Pro Thr 
160 165 170 

ACA TTG CCC GAC GTG GAC ACC ACC AAT CCT CAG GTT CGG ACG TTT TTC 907 
Thr Leu Pro Asp Val Asp Thr Thr Asn Pro Gln Val Arg Thr Phe Phe 
175 180 185 

AAC GAC TGG ATC AAG AGC CTG GTG GCG AAC TAC TCC A GTATGATTGT 954 
Asn Asp Trp Ile Lys Ser Leu Val Ala Asn Tyr Ser 
190 195 200 

TCCCGCGGTA ACGCTTTAGG GCTTGCTCTA ACTGAAATCG ACAG TC GAT GGT CTG 1009 

Ile Asp Gly Leu 
205 



CGC GTC GAC ACC GTT AAG CAC GTG GAG AAA GAT TTC TGG CCC GAC TTC 1057 
Arg Val Asp Thr Val Lys His Val Glu Lys Asp Phe Trp Pro Asp Phe 
210 215 220 

AAC GAA GCT GCT GCG TGT ACC GTC GGC GAG GTG TTC AAC GGT GAC CCA 1105 
Asn Glu Ala Ala Ala Cys Thr Val Gly Glu Val Phe Asn Gly Asp Pro 
225 230 235 

GCG TAC ACC TGC CCA TAC CAG GAA GTG CTG GAT GGC GTT CTG AAC TAT 1153 
Ala Tyr Thr Cys Pro Tyr Gln Glu Val Leu Asp Gly Val Leu Asn Tyr 
240 245 250 

CCG AT GTGAGTGATT CCGAAAGTTC CATCGATCAG GCTTTCTGAC GCATGAGAAC 1208 
Pro Ile 

255 
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AGC TAC TAT CCT GCG CTT GAT GCA TTC AAG TCT GTC GGC GGC AAT CTC 1256 

Tyr Tyr Pro Ala Leu Asp Ala Phe Lys Ser Val Gly Gly Asn Leu 
260 265 270 

GGC GGC TTG GCT CAG GCC ATC ACC ACC GTG CAG GAG AGC TGC AAG GAT 1304 

Gly Gly Leu Ala Gln Ala Ile Thr Thr Val Gln Glu Ser Cys Lys Asp 
275 280 285 

TCC AAT CTG CTC GGC AAT TTC CTT GAG AAT CAC GAC ATT GCT CGC TTT 1352 

Ser Asn Leu Leu Gly Asn Phe Leu Glu Asn His Asp Ile Ala Arg Phe 
290 295 300 

GCT TC GTATGGACAC TCTTTTTGAA GCCCTCATCG ATTGGGGATG CTGACACGGA 1407 
Ala Ser 



CAACAACAAC AG G TAC ACG GAT GAC CTT GCT CTC GCC AAG AAT GGT CTC 1456 
Tyr Thr Asp Asp Leu Ala Leu Ala Lys Asn Gly Leu 
305 310 315 

GCT TTC ATC ATC CTC TCG GAT GGT ATT CCG ATC ATC TAC ACG GGC CAG 1504 
Ala Phe Ile Ile Leu Ser Asp Gly Ile Pro Ile Ile Tyr Thr Gly Gln 
320 325 330 

GAG CAG CAC TAC GCC GGT GAT CAC GAT CCC ACA AAT CGT GAG GCC GTC 1552 
Glu Gln His Tyr Ala Gly Asp His Asp Pro Thr Asn Arg Glu Ala Val 
335 340 345 

TGG CTG TCT GGC TAC AAT ACC GAC GCC GAG CTG TAC CAG TTC ATC AAG 1600 
Trp Leu Ser Gly Tyr Asn Thr Asp Ala Glu Leu Tyr Gln Phe Ile Lys 
350 355 360 

AAG GCC AAT GGC ATC CGC AAC TTG GCT ATC AGC CAG AAC CCG GAA TTC 1648 
Lys Ala Asn Gly Ile Arg Asn Leu Ala Ile Ser Gln Asn Pro Glu Phe 
365 370 375 380 

ACC TCC TCC AAG GTGAGTACAA TAACAAACTT TTCGAAAAAT TTTTCACCGG 1700 
Thr Ser Ser Lys 



AGAAAACCTA AGATTCGGCT AACAAAACAA AAAAAAAAAA G ACC AAG GTC ATC 1753 

Thr Lys Val Ile 
385 

TAC CAA GAC GAT TCG ACC CTT GCC ATT AAC CGG GGC GGC GTC GTT ACT 1801 
Tyr Gln Asp Asp Ser Thr Leu Ala Ile Asn Arg Gly Gly Val Val Thr 
390 395 400 

GTC CTG AGC AAT GAA GGC GCC TCC GGG GAG ACC GGG ACT GTC TCC ATT 1849 
Val Leu Ser Asn Glu Gly Ala Ser Gly Glu Thr Gly Thr Val Ser Ile 
405 410 415 420 
CCG GGA ACT GGC TTC GAG GCC GGC ACG GAA TTG ACT GAT GTC ATC TCC 1897 

Pro Gly Thr Gly Phe Glu Ala Gly Thr Glu Leu Thr Asp Val Ile Ser 
425 430 435 

IbL r/*.U rKL L3!U ''L : O^d UUU UmL A^lL buu UlU U I L LlML Li I U I ij 1 r*0 

Cys Lys Thr Val Thr Ala Gly Asp Ser Gly A" 3 Val Asp Val Pro Leu 
440 445 450 

TCG GGC GGA CTG CCA AGC GTG CTC TAT CCC AGC TCC CAG CTG GCC AAG 1993 

Ser Gly Gly Leu Pro Ser Val Leu Tyr Pro Ser Ser Gln Leu Ala Lys 
455 460 465 

AGT GGT CTG TGT GCG TCG GCG TGA 2017 

Ser Gly Leu Cys Ala Ser Ala 

470 475 



FIGURE 4 

α-AMYLASE CODING SEQUENCE 

SEQUENCE TYPE: Nucleotide 

MOLECULE TYPE: DNA 

ORIGINAL SOURCE: Solanum tuberosum 

SEQUENCE LENGTH: 1570 

SEQUENCE: 

10 20 30 40 

TGTGGTGATC GAATTTCAAT TTTTTTTACT GAGTATCTAG 

50 60 70 80 

GTTGAGGAAC GTAATTTCAA GCTGCGATCG GCTTTTTCCC 
90 100 110 120 

CTGAACGAGC AAACACAGGT TGTGGGTTCG AGTTAGCAAG 

130 140 150 160 

GGACGTATAA TCTCAACTAC AATCCATTAT GGCGCTTGAT 

170 180 190 200 

GAAAGTCAGC AGTCTGATCC ATTGGTTGTG ATACGCAATG 

210 220 230 240 

GAAAGGAGAT CATATTGCAG GCATTCGACT GGGAATCTCA 

250 260 270 280 

TAAACATGAT TGGTGGCTAA ATTTAGATAC GAAAGTTCCT 

290 300 310 320 

GATATTGCAA AGTCTGGTTT CACAACTGCT TGGCTGCCTC 

330 340 350 360 

CGGTGTGTCA GTCATTGGCT CCTGAAGGTT ACCTTCCACA 

370 380 390 400 

GAACCTTTAT TCTCTCAATT CTAAATATGG TTCTGAGGAT 

410 420 430 440 

CTCTTAAAAG CTTTACTTAA TAAGATGAAG CAGTACAAAG 

450 460 470 480 

TTAGAGCGAT GGCGGACATA GTCATTAACC ACCGTGTTGG 

490 500 510 520 

GACTACTCAA GGGCATGGTG GAATGTACAA CCGCTATGAT 

530 540 550 560 

GGAATTCCTA TGTCTTGGGA TGAACATGCT ATTACATCTT 

570 580 590 600 

GCACTGGTGG AAGGGGTAAC AAAAGCACTG GAGACAACTT 

610 620 630 640 

TAATGGAGTT CCAAATATAG ATCATACACA ATCCTTTGTT 

650 660 670 680 

CGGAAAGATC TCATTGACTG GATGCGGTGG CTAAGATCCT 

690 700 710 720 

CTGTTGGCTT CCAAGATTTT CGTTTTGATT TTGCCAAAGG 

730 740 750 760 

TTATGCTTCA AAGTATGTAA AGGAATATAT CGAGGGAGCT 

770 780 790 800 

GAGCCAATAT TTGCAGTTGG AGAATACTGG GACACTTGCA 

810 820 830 840 

ATTACAAGGG CAGCAATTTG GATTACAACC AAGATAGTCA 

850 860 870 880 

CAGGCAAAGA ATCATCAATT GGATTGATGG CGCGGGACAA 

890 900 910 920 

CTTTCAACTG CATTCGATTT TACAACAAAA GCAGTCCTTC 

930 940 950 960 

AGGAAGCAGT CAAAGGAGAA i . CTGGCGTT TGCGTGACTC 

970 980 990 1000 

TAAGGGGAAG CCCCCAGGAG TTTTAGGATT GTGGCCTTCA 

1010 1020 1030 1040 

AGGGC i 3 ' C i -. " 1 ATTGA TAATCACGAC ACTGGATCAA 

1050 1060 1070 1080 

[illegible] TTGGCC : ■ i C CCTTCACGTC ATGTTATGGA 

1090 1100 1110 1120 

uGGCTATGCA TACATTCTTA CACACCCAGG GATACCATCA 

1130 1140 1150 1160 

GTTTTCTTTG ACCATTTCTA CGAATGGGAT AAiTTCCATGC 

1170 1180 1190 1200 

ATGACCAAAT TGTAAAGCTG ATTGCTATTC GGAGGAATCA 

1210 1220 1230 1240 

AGGCATACAC AGCCGTTCAT CTATAAGAAT TCTTGAGGCA 

1250 1260 1270 1280 

CAGCCAAACT TATACGCTGC AACCATTGAT GAAAAGGTTA 

1290 1300 1310 1320 

GCGTGAAGAT TGGGGACGGA TCATGGAGCC CTGCTGGGAA 

1330 1340 1350 1360 

AGAGTGGACT CTCGCGACCA GTGGCCATCG CTATGCAGTC 

1370 1380 1390 1400 

TGGCAGAAGT AATCTTACAG CTATTCCGTT ACTTAATATA 

1410 1420 1430 1440 

TTAGTAGAAA TATATATGTT TTAAACCCGA GCACCTACTT 

1450 1460 1470 1480 

CTAACACTAG ATCCGCCTCT ACAGGCTTGG ATGGAGTGAT 

1490 1500 1510 1520 

GAGIIiim TTCCTGTTCA TTAGACATTG CAACATGGGA 

1530 1540 1550 1560 

TGTATGTTTT GTTAATAAAA GTGTTCTTGA TCAATGCAAT 

1570 
GTAATAAGGG 

SEQUENCE: Nucleotide sequence of a cDNA encoding the large subunit of ADP-glucose 
pyrophosphorylase from barley seed endosperm (bepl10) 
SEQUENCE TYPE: NUCLEIC ACID 
MOLECULE TYPE: DNA 
ORIGINAL SOURCE: BARLEY 
SEQUENCE LENGTH: 2037 
STRANDEDNESS: DOUBLE 
TOPOLOGY: LINEAR 

1 ACGACCACCT CCGAACTCAA CGCCTCCACG GACCATCTCT 
41 CTCCTCTCCC CTCCCCTCAC CACCACCACC ACCACCACCC 
81 CTTCTCCCTC CCTGCATTTG ATTCGTTCAT ATTCATCCGT 
121 CGCTTGCCCG GTCGCCACCC CGTCGATCCC TCACCCCGCC 
161 GTCCCCGGCA GTTGCAGGTG GACTGCTAAT GTCATCGATG 
201 CAGTTCAGCA GCGTGCTGCC CCTGGAGGGC AAGGCGTGCG 
241 TTTCCCCAGT CAGGAGAGAG GGATCGGCCT GCGAGCGCCT 
281 CAAGATCGGG GACAGCAGCA GCATCAGGCA CGAGAGAGCG 
321 TCCAGGAGGA TGTGCAACGG CGGCGCAGGG GCCCCGCCGC 
361 CACCGGTGCG CAGTGCGTGC TCACCTCCGA CGCCAGCCCG 
401 GCCGACACCC TTGTTCTCCG GACGTCCTTC CGGAGGAATT 

ACGCCGATCC GAACGAGGTC GCGGCCGTCG GTCGCGGCCG 

TCATACTCGG CGGCGGCACC GGGACTCAGC TCTTCCCGCT 

CACAAGCACA AGGGCCACAC CTGCTGTTCC TATTGGAGGA 

TGTTACAGGC TCATCGATAT TCCCATGAGC AACTGCTTCA 
601 ACAGTGGCAT CAACAAGATA TTCGTCATGA CCCAGTTCAA 

CTCGGCATCT CTCAATCGCC ACATTCACCG CACCTACCTC 

GGCGGGGGAA TCAATTTCAC TGATGGATCT GTTGAGGTAT 

TGGCCGCGAC ACAAATGCCT GGGGAGGCTG CTGGATGGTT 

CCGCGGAACA GCGGATGCCG TCAGAAAATT TATCTGGGTG 
801 CTTGAGGACT ACTATAAGCA TAAATCCATA GAGCACATTT 

TGATCTTGTC GGGCGATCAG CTTTATCGCA TGGATTACAT 

GGAGCTTGTG CAGAAACATG TGGATGACAA TGCTGACATT 

ACTTTATCAT GTGCCCCTGT TGGAGAGAGC CGGGCATCTG 

AGTACGGGCT AGTGAAGTTC GACAGTTCAG GCCGTGTGAT 
1001 CCAGTTTTCT GAGAAGCCAA AGGGCGACGA TCTGGAAGCG 

ATGAAAGTGG ATACCAGTTT TCTCAATTTC GCCATAGACG 

ACCCTGCTAA ATATCCATAC ATTGCTTCGA TGGGAGTTTA 

TGTCTTCAAG AGAGATGTTC TGCTGAACCT TCTAAAGTCA 

AGATACGCAG AACTACATGA CTTTGGGTCT GAAATCCTCC 
1201 CGAGAGCTCT GCATGATCAC AATGTACAGG CATATGTCTT 

CACTGACTAC TGGGAGGACA TTGGAACAAT CAGATCCTTC 

TTCGATGCGA ACATGGCCCT CTGCGAACAG CCTCCAAAGT 

TTGAATTTTA TGATCCAAAA ACCCCCTTCT TCACTTCGCC 

TCGGTACTTA CCGCCAACAA AGTCAGACAA GTGCAGGATC 
1401 AAAGAAGCGA TCATTTCGCA CGGCTGCTTC TTGCGTGAAT 

GCAAAATCGA GCACTCCATC ATCGGCGTTC GTTCACGCCT 

AAACTCCGGA AGCGAGCTCA AGAACGCGAT GATGATGGGC 

GCGGACTCGT ACGAGACCGA GGACGAGATC TCGAGGCTGA 

TGTCTGAGGG CAAGGTTCCC ATCGGCGTCG GGGAGAACAC 
1601 AAAGATCAGC AACTGCATCA TCGACATGAA CGCGAGGATA 
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GGAAGGGACG TGGTCATCTC AAACAAGGAG GGGGTGCAAG 

AAGCCGACAG GCCGGAGGAA GGGTACTACA TCAGGTCCGG 

GATCGTGGTG ATCCAGAAGA ACGCGACCAT CAAGGACGGC 

ACCGTCGTGT AGGGCGTGCC GGGTCGGCGC GACGGGGTTC 

1801 TGCGACAACC TGTGCGCTGC GTCGGTCGTC ATCATCTTCT 

CAAACTCCGG GACTGAAGAA GTGATCCGGG GACGGGAGAC 

GTTTGAAGCT TGAATGACTG AGACTGAAAG TGAAGGCGCA 

GCAGAGGCAG GCAGCATTAG TAGTAAGTAG TAAGTAAGTA 

GCAGTGGAAC AAAGTAATAG TCGTTCGTTT TTCCCCTGTA 

2001 ATAAATAAGA GGCTGTGTGT TGAGGTAAAA AAAAAAA 
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SEQUENCE: 

Nucleotide sequence of a cDNA encoding the small subunit of ADP-glucose 
pyrophosphorylase from barley seed endosperm (beps) 
SEQUENCE TYPE: NUCLEIC ACID 
MOLECULE TYPE: DNA 
ORIGINAL SOURCE: BARLEY 
SEQUENCE LENGTH: 1822 
STRANDEDNESS: DOUBLE 
TOPOLOGY: LINEAR 
COMMENT: The "." at 1569 denotes a purine. 



AAAAGTGAAC TCACACATCA CTCAATATCT ATATCCTTCC
ATTTTATATC CCTCGGTGAT GGATGTACCT TTGGCATCTA
AAGTTCCCTT GCCCTCCCCT TCCAAGCATG AACAATGCAA
CGTTTATAGT CATAAGAGCT CATCGAAGCA TGCAGATCTC
AATCCCCATG CTATTGATAG TGTTCTCGGT ATCATTCTTG
201 GAGGTGGTGC AGGGACTAGA TTGTATCCCC TGACGAAGAA
GC6TGCAAAG CCTGCAGTGC CATTGGGTGC CAACTACAGG
CTTATTGATA TTCCTGTCAG TAATTGTCTG AACAGCAACA
TATCAAAGAT CTATGTGCTT ACACAGTTCA ACTCAGCTTC
TCTTAATCGT CATCTCTCAC GAGCCTATGG GAGCAACATT
401 GGAGGTTACA AGAATGAAGG ATTTGTTGAA GTCCTTGCTG
CACAGCAGAG CCCAGATAAC CCTGACTGGT TCCAGGGTAC
TGCAGATGCT GTAAGGCAGT ACTTGTGGCT ATTCGAGGAG
CATAATGTTA TG6AGTATCT AATTCTTGCT GGAGATCACC
TGTACCGAAT GGACTATGAA AAGTTTATTC AGGCACACAG
601 AGAAACGGAT GCTGATATTA CTGTTGCTGC CTTGCCCATG
GATGAGGAAC GTGCAACTGC ATTTGGCCTT ATGAAAATCG
ATGAAGAAGG GAGGATAATT GAATTCGCAG AGAAACCAAA
AGGAGAACAG TTGAAAGCTA TGATGGTTGA TACGACCATA
CTTGGCCTTG AAGATGCGAG GGCAAAGGAA ATGCCTTATA
801 TTGCTAGCAT GGGTATCTAT GTTATTAGCA AACATGTGAT
GCTTCAGCTT CTCCGTGAGC AATTTCCTGG AGCTAATGAC
TTCGGAAGTG AAGTTATCCC TGGTGCAACT AGCACTGGCA
TGAGGGTACA AGCATACCTA TACGACGGTT ACTGGGAAGA
TATTGGTACA ATTGAGGCAT TCTATAATGC AAATTTGGGA
1001 ATTACCAAAA AACCAATACC TGATTTCAGT TTCTATGACC
GTTCTGCTCC CATTTACACA CAACCTCGAC ACTTGCCTCC
TTCAAAGGTT CTTGATGCTG ATGTGACAGA CAGTGTAATT
GGTGAAGGAT GTGTTATTAA AAACTGCAAG ATACACCATT
CAGTAGTTGG ACTCCGTTCC TGCATATCTG AAGGTGCAAT
1201 AATAGAGGAC ACGTTGCTAA TGGGTGCGGA CTACTATGAG
ACTGAAGCTG ATAAGAAACT CCTTGCTGAA AAAGGTGGCA
TTCCCATTGG TATTGGAAAG AATTCACACA TCAAAAGAGC
AATCATTGAC AAGAATGCTC GTATTGGAGA TAACGTGATG
ATAATCAATG TTGACAATGT TCAAGAAGCG GCGAGGGAGA
1401 CAGATGGATA CTTCATCAAA AGTGGCATCG TAACTGTGAT
CAAGGATGCT TTACTCCCTA GTGGAACAGT CATATGAAGC
AGATGTGAAA TGTATGCCAA AAGACAGGGC TACTTGCGTC
AGTCTGGAAT CAACCAACAA GGCCGCGAAG GAGATCATAA
AATAAAAA.G GAGTGCCATG CGAGTCACTT CTACACCCTT
1601 TTCCCCCCTT GATGTATTAG GAACTGTGAT GTACAAGCAA
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CTGTGATGCA CTTACGCGAA GTGCCCCTGG ATTCAGCTTT 

CTCTTTGCTT GTAACTGGTT TCCAGCAGAC CATGCTATTT 

GTTGTATGGT TCGTGCAAAA CCTTGCGATG CTTTATATAT 

GCTTTATATA 7AAACAAGAT GAATCCCCGC GCGTT3CTGC 
1801 GGCACAAAAA AAAAAAAAAA AA 
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FIGURE 7 

a-GLUCAN LYASE CODING SEQUENCE 

SEQUENCE TYPE: NUCLEIC ACID 

MOLECULE TYPE: DNA (GENOMIC) 

ORIGINAL SOURCE: FUNGALLY INFECTED ALGAE 

SEQUENCE LENGTH: 3267 BP 

STRANDEDNESS: DOUBLE 

SEQUENCE: 

10 20 30 40 50 60 

1 ATGl i 1 iCAA CCCTTGCGi i TGTCGCACCT AGTGCGCTGG GAGCCAGTAC vi iCGTAGGG 
61 GCGGAGGTCA GGTCAAATGT TCGTATCCAT TCCGCTuTC CAGCTGTGCA CACAGCTACT 
121 CGCAAAACCA ATCGCCTCAA TGTATCCATG ACCGCATTGT CCGACAAACA AACGGCTACT 
181 GCGGGTAGTA CAGACAATCC GGACGGTATC GACTACAAGA CCTACGATTA CGTCGGAGTA 
241 TGGGGTTTCA GCCCCCTCTC CAACACGAAC TGGTTTGCTG CCGGCTCTTC TACCCCGGGT 
301 GGCATCACTG ATTGGACGGC TACAATGAAT GTCAACTTCG ACCGTATCGA CAATCCGTCC 
361 ATCACTGTCC AGCATCCCGT TCAGGTTCAG GTCACGTCAT ACAACAACAA CAGCTACAGG 
421 GTTCGCTTCA ACCCTGATGG CCCTATTCGT GATGTGACTC GTGGGCCTAT CCTCAAGCAG 
481 CAACTAGATT GGATTCGAAC GCAGGAGCTG TCAGAGGGAT GTGATCCCGG AATGACTTTC 
541 ACATCAGAAG GTTTCTTGAC TTTTGAGACC AAGGATCTAA GCGTCATCAT CTACGGAAAT 
601 TTCAAGACCA GAGTTACGAG AAAGTCTGAC GGCAAGGTCA TCATGGAAAA TGATGAAGTT 
661 GGAACTGCAT CGTCCGGGAA CAAGTGCCGG GGATTGATGT TCGTTGATAG ATTATACGGT 
721 AACGCTATCG CTTCCGTCAA CAAGAACTTC CGCAACGACG CGGTCAAGCA GGAGGGATTC 
781 TATGGTGCAG GTGAAGTCAA CTGTAAGTAC CAGGACACCT ACATCTTAGA ACGCACTGGA 
841 ATCGCCATGA CAAATTACAA CTACGATAAC TTGAACTATA ACCAGTGGGA CCTTAGACCT 
901 CCGCATCATG ATGGTGCCCT CAACCCAGAC TATTATATTC CAATGTACTA CGCAGCACCT 
961 TGGTTGATCG TTAATGGATG CGCCGGTACT TCGGAGCAGT ACTCGTATGG ATGGTTCATG 
1021 GACAATGTCT CTCAATCTTA CATGAATACT GGAGATACTA CCTGGAATTC TGGACAAGAG 
1081 GACCTGGCAT ACATGGGCGC GCAGTATGGA CCATTTGACC AACATTTTGT TTACGGTGCT 
1141 GGGGGTGGGA TGGAATGTGT GGTCACAGCG TTCTCTCTTC TACAAGGCAA GGAGTTCGAG 
1201 AACCAAGTTC TCAACAAACG TTCAGTAATG CCTCCGAAAT ACGTCTTTGG TTTCTTCCAG 
1261 GGTGTTTTCG GGACTTCTTC CTTGTTGAGA GCGCATATGC CAGCAGGTGA GAACAACATC 
1321 TCAGTCGAAG AAATTGTAGA AGGTTATCAA AACAACAATT TCCCTTTCGA GGGGCTCGCT 
1381 GTGGACGTGG ATATGCAAGA CAACTTGCGG GTGTTCACCA CGAAGGGCGA ATTTTGGACC 
1441 GCAAACAGGG TGGGTACTGG CGGGGATCCA AACAACCGAT CGGTTTTTGA ATGGGCACAT 
1501 GACAAAGGCC TTGTTTGTCA GACAAATATA ACTTGCTTCC TGAGGAATGA TAACGAGGGG 
1561 CAAGACTACG AGGTCAATCA GACGTTAAGG GAGAGGCAGT TGTACACGAA GAACGACTCC 
1621 CTGACGGGTA CGGATTTTGG AATGACCGAC GACGGCCCCA GCGATGCGTA CATCGGTCAT 
1681 CTGGACTATG GGGGTGGAGT AGAATGTGAT GCACTTTTCC CAGACTGGGG ACGGCCTGAC 
1741 GTGGCCGAAT GGTGGGGAAA TAACTATAAG AAACTGTTCA GCATTGGTCT CGACTTCGTC 
1801 TGGCAAGACA TGACTGTTCC AGCAATGATG CCGCACAAAA TTGGCGATGA CATCAATCTG 
1861 AAACCGGATG GGAATTGGCC GAATGCGGAC GATCCGTCCA ATGGACAATA CAACTGGAAG 
1921 ACGTACCATC CCCAAGTGCT TGTAACTGAT ATGCGTTATG AGAATCATGG TCGGGAACCG 
1981 ATGGTCACTC AACGCAACAT TCATGCGTAT ACACTGTGCG AGTCTACTAG GAAGGAAGGG 
2041 ATCGTGGAAA ACGCAGACAC TCTAACGAAG TTCCGCCGTA GCTACATTAT CAGTCGTGGT 
2101 GGTTACATTG GTAACCAGCA TTTCGGGGGT ATGTGGGTGG GAGACAACTC TACTACATCA 
2161 AACTACATCC AAATGATGAT TGCCAACAAT ATTAACATGA ATATGTCTTG CTTGCCTCTC 
2221 GTCGGCTCCG ACATTGGAGG ATTCACCTCA TACGACAATG AGAATCAGCG AACGCCGTGT 
2281 ACCGGGGACT TGATGGTGAG GTATGTGCAG GCGGGCTGCC TGTTGCCGTG GTTCAGGAAC 
2341 CACTATGATA GGTGGATCGA GTCCAAGGAC CACGGAAAGG ACTACCAGGA GCTGTACATG 
2401 TATCCGAATG AAATGGATAC GTTGAGGAAG TTCGTTGAAT TCCGTTATCG CTGGCAGGAA 
2461 GTGTTGTACA CGGCCATGTA CCAGAATGCG GCTTTCGGAA AGCCGATTAT CAAGGCTGCT 
2521 TCGATGTACA ATAACGACTC AAACGTTCGC AGGGCGCAGA ACGATCATTT CCTTCTTGGT 
2581 GGACATGATG GATATCGCAT TCTGTGCGCG CCTGTTGTGT GGGAGAATTC GACCGAACGC 
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2641 GAATTGTACT TGCCCGTGCT GACCCAATGG TACAAATTCG GTCCCGACTT TGACACCAAG 

2701 CCTCTGGAAG GAGCGATGAA CGGAGGGGAC CGAATTTACA ACTACCCTGT ACCGCAAAGT 

2761 GAATCACCAA TCTTCGTGAG AGAAGGTGCG ATTCTCCCTA CCCGCTACAC GTTGAACGGT 

2821 GAAAACAAAT CATTGAACAC GTACACGGAC GAAGATCCG7 TGGTGTTTGA AGTATTCCCC 

2881 CTCGGAAACA ACCGTGCCGA CGGTATGTGT TATCTTGAT3 ATGGCGGTGT GACCACCAAT 

2941 GCTGAAGACA ATGGCAAGTT CTCTGTCGTC AAGGTGGCAG CGGAGCAGGA TGG7GGTACG 

3001 GAGACGATAA CGTTTACGAA TGATTGCTAT GAGTACGTT7 TCGGTGGACC GTTC7ACGTT 

3061 CGAGTGCGCG GCGCTCAGTC GCCGTCGAAC ATCCACGTGT C7TCTGGAGC GGGTCTCAG 

3121 GACATGAAGG TGAGCTCT3C CACTTCCAGG GCTGCGCTGT 7CAATGACGG GGAGAACGGT 

3181 GATTTCTGGG 7TGACCAGGA GACAGA7TCT CTGTGGCTGA AGTTGCCCAA CGTTG77CTC 

3241 CCGGACGCTG TGATCACAAT TACCTAA 
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a-GLUCAN LYASE CODING SEQUENCE 

SEQUENCE TYPE: NUCLEIC ACID 

MOLECULE TYPE: DNA (GENOMIC) 

ORIGINAL SOURCE: FUNGALLY INFECTED ALGAE 

SEQUENCE LENGTH: 3276 BP 

STRANDEDNESS : DOUBLE 

SEQUENCE 

10 20 30 40 50 60 

1 ATGTATCCAA CCCTCACCTT CGTGGCGCCT AGTGCGCTAG GGGCCAGAAC TTTCACGTGT
61 GTGGGCATTT TTAGGTCACA CATTCTTATT CATTCGGTTG TTCCAGCGGT GCGTCTAGCT
121 GTGCGCAAAA GCAACCGCCT CAATGTATCC ATGTCCGCTT TGTTCGACAA ACCGACTGCT
181 GTTACTGGAG GGAAGGACAA CCCGGACAAT ATCAATTACA CCACTTATGA CTACGTCCCT
241 GTGTGGCGCT TCGACCCCCT CAGCAATACG AACTGGTTTG CTGCCGGATC TTCCACTCCC
301 GGCGATATTG ACGACTGGAC GGCGACAATG AATGTGAACT TCGACCGTAT CGACAATCCA
361 TCCTTCACTC TCGAGAAACC GGTTCAGGTT CAGGTCACGT CATACAAGAA CAATTGTTTC
421 AGGGTTCGCT TCAACCCTGA TGGTCCTATT CGCGATGTGG ATCGTGGGCC TATCCTCCAG
481 CAGCAACTAA ATTGGATCCG GAAGCAGGAG CAGTCGAAGG GGTTTGATCC TAAGATGGGC
541 TTCACAAAAG AAGGTTTCTT GAAATTTGAG ACCAAGGATC TGAACGTTAT CATATATGGC
601 AATTTTAAGA CTAGAGTTAC GAGGAAGAGG GATGGAAAAG GGATCATGGA GAATAATGAA
661 GTGCCGGCAG GATCGTTAGG GAACAAGTGC CGGGGATTGA TGTTTGTCGA CAGGTTGTAC
721 GGCACTGCCA TCGCTTCCGT TAATGAAAAT TACCGCAACG ATCCCGACAG GAAAGAGGGG
781 TTCTATGGTG CAGGAGAAGT AAACTGCGAG TTTTGGGACT CCGAACAAAA CAGGAACAAG
841 TACATCTTAG AACGAACTGG AATCGCCATG ACAAATTACA ATTATGACAA CTATAACTAC
901 AACCAGTCAG ATCTTATTGC TCCAGGATAT CCTTCCGACC CGAACTTCTA CATTCCCATG
961 TATTTTGCAG CACCTTGGGT AGTTGTTAAG GGATGCAGTG GCAACAGCGA TGAACAGTAC
1021 TCGTACGGAT GGTTTATGGA TAATGTCTCC CAAACTTACA TGAATACTGG TGGTACTTCC
1081 TGGAACTGTG GAGAGGAGAA CTTGGCATAC ATGGGAGCAC AGTGCGGTCC ATTTGACCAA
1141 CATTTTGTGT ATGGTGATGG AGATGGTCTT GAGGATGTTG TCCAAGCGTT CTCTCTTCTG
1201 CAAGGCAAAG AGTTTGAGAA CCAAGTTCTG AACAAACGTG CCGTAATGCC TCCGAAATAT
1261 GTGTTTGGTT ACTTTCAGGG AGTCTTTGGG ATTGCTTCCT TGTTGAGAGA GCAAAGACCA
1321 GAGGGTGGTA ATAACATCTC TGTTCAAGAG ATTGTCGAAG GTTACCAAAG CAATAACTTC
1381 CCTTTAGAGG GGTTAGCCGT AGATGTGGAT ATGCAACAAG ATTTGCGCGT GTTCACCACG
1441 AAGATTGAAT TTTGGACGGC AAATAAGGTA GGCACCGGGG GAGACTCGAA TAACAAGTCG
1501 GTGTTTGAAT GGGCACATGA CAAAGGCCTT GTATGTCAGA CGAATGTTAC TTGCTTCTTG
1561 AGAAACGACA ACGGCGGGGC AGATTACGAA GTCAATCAGA CATTGAGGGA GAAGGGTTTG
1621 TACACGAAGA ATGACTCACT GACGAACACT AACTTCGGAA CTACCAACGA CGGGCCGAGC
1681 GATGCGTACA TTGGACATCT GGACTATGGT GGCGGAGGGA ATTGTGATGC ACTTTTCCCA
1741 GACTGGGGTC GACCGGGTGT GGCTGAATGG TGGGGTGATA ACTACAGCAA GCTCTTCAAA
1801 ATTGGTCTGG ATTTCGTCTG GCAAGACATG ACAGTTCCAG CTATGATGCC ACACAAAGTT
1861 GGCGACGCAG TCGATACGAG ATCACCTTAC GGCTGGCCGA ATGAGAATGA TCCTTCGAAC
1921 GGACGATACA ATTGGAAATC TTACCATCCA CAAGTTCTCG TAACTGATAT GCGATATGAG
1981 AATCATGGAA GGGAACCGAT GTTCACTCAA CGCAATATGC ATGCGTACAC ACTCTGTGAA
2041 TCTACGAGGA AGGAAGGGAT TGTTGCAAAT GCAGACACTC TAACGAAGTT CCGCCGCAGT
2101 TATATTATCA GTCGTGGAGG TTACATTGGC AACCAGCATT TTGGAGGAAT GTGGGTTGGA
2161 GACAACTCTT CCTCCCAAAG ATACCTCCAA ATGATGATCG CGAACATCGT CAACATGAAC
2221 ATGTCTTGCC TTCCACTAGT TGGGTCCGAC ATTGGAGGTT TTACTTCGTA TGATGGACGA
2281 AACGTGTGTC CCGGGGATCT AATGGTAAGA TTCGTGCAGG CGGGTTGCTT ACTACCGTGG
2341 TTCAGAAACC ACTATGGTAG GTTGGTCGAG GGCAAGCAAG AGGGAAAATA CTATCAAGAA
2401 CTGTACATGT ACAAGGACGA GATGGCTACA TTGAGAAAAT TCATTGAATT CCGTTACCGC
2461 TGGCAGGAGG TGTTGTACAC TGCTATGTAC CAGAATGCGG CTTTCGGGAA ACCGATTATC
2521 AAG6CAGCTT CCATGTACGA CAACGACAGA AACGTTCGCG GCGCACAGGA TGACCACTTC
2581 CTTCTCGGCG GACACGATGG ATATCGTATT TTGTGTGCAC CTGTTGTGTG GGAGAATACA
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2641 ACCAGTC6CG ATCTGTACTT GCCTGTGCTG ACCAAATGGT ACAAATTCGG CCCTGACTAT 

2701 GACACCAAGC GCCTGGATTC TGCGTTGGAT GGAGGGCAGA TGATTAAGAA CTATTCTGTG 

2761 CCACAAAGCG ACTCTCCGAT ATTTGTGAGG GAAGGAGCTA TTCTCCCTAC CCGCTACACG 

2821 TTGGACGGTT CGAACAAGTC AATGAACACG TACACAGACA AAGACCCGTT GGTGTTTGAG 

2881 6TATTCCCTC TTGGAAACAA CCGTGCCGAC GGTAT6TGTT ATCTTGATGA TGGCGGTAT1 

2941 ACTACAGAT6 CTGAGGACCA TGGCAAATTC TCTCTTATCA ATGTCGAAGC CTACGGAAA 

3001 GGTGTTACGA CGACGATCAA GTTTGCGTAT GACACTTATC AATACGTATT "^GATGGTCCA 

3061 TTCTACGTTC GAATCCGTAA TCTTACGACT GCATCAAAAA. TTAACGTGTC "CTGGAGCG 

3121 GGTGAAGAGG ACATGACACC GACCTCTGCG ^ACTCGAGGG CAGCTTTGTT CAGTGATGGA 

3181 GGTGTTGGAG AATACTGGGC TGACAATGAT ACGTCTTCiC TGTGGATGAA GTTGCCAAAC 

3241 CTGGTTCTGC AAGACGCTGT GATTACCATT ACGTAG 
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a-GLUCAN LYASE CODING SEQUENCE 
SEQUENCE TYPE: NUCLEIC ACID 
MOLECULE TYPE: DNA (GENOMIC) 
ORIGINAL SOURCE: FUNGUS 
SEQUENCE LENGTH: 3201 BP 
STRANDEDNESS : DOUBLE 
SEQUENCE: 
10 20 30 40 50 60
ATGGCAGGAT TTTCTGATCC TCTCAACTTT TGCAAAGCAG AAGACTACTA CAGTGTTGCG

70 80 90 100 110 120
CTAGACTGGA AGGGCCCTCA AAAAATCATT GGAGTAGACA CTACTCCTCC AAAGA.GCACC

130 140 150 160 170 180
AAGTTCCCCA AAAACTGGCA TGGAGTGAAC TTGAGATTCG ATGATGGGAC TTTAGGTGTG

190 200 210 220 230 240
G7TCAGTTCA TTAGGCCGTG CGTTTGGAGG GTTAGATACG ACCCTGGTTT CAAGACCTCT

250 260 270 280 290 300
GACGAGTATG GTGATGAGAA TACGAGGACA ATTGTGCAAG ATTATATGAG TACTCTGAGT

310 320 330 340 350 360
AATAAATTGG ATACTTATAG AGGTCTTACG TGGGAAACCA AGTGTGAGGA TTCGGGAGAT

370 380 390 400 410 420
TTCTTTACCT TCTCATCCAA GGTCACCGCC GTTGAAAAAT CCGAGCGGAC CCGCAACAAG

430 440 450 460 470 480
GTCGGCGATG GCCTCAGAAT TCACCTATGG AAAAGCCCTT TCCGCATCCA AGTAGTGCGC

490 500 510 520 530 540
ACCTTGACCC CTTTGAAGGA TCCTTACCCC ATTCCAAATG TAGCCGCAGC CGAAGCCCGT

550 560 570 580 590 600
GTGTCCGACA AGGTCGTTTG GCAAACGTCT CCCAAGACAT TCAGAAAGAA CCTGCATCCG

610 620 630 640 650 660
CAACACAAGA TGCTAAAGGA TACAGTTCTT GACATTGTCA AACCTGGACA TGGCGAGTAT

670 680 690 700 710 720
GTGGGGTGGG GAGAGATGGG AGGTATCCAG TTTATGAAGG AGCCAACATT CATGAACTAT

730 740 750 760 770 780
TTTAACTTCG ACAATATGCA ATACCAGCAA GTCTATGCCC AAGGTGCTCT CGATTCTCGC

790 800 810 820 830 840
GAGCCACTGT ACCACTCGGA TCCCTTCTAT CTTGATGTGA ACTCCAACCC GGAGCACAAG

850 860 870 880 890 900
AATATCACGG CAACCTTTAT CGATAACTAC TCTCAAATTG CCATCGACTT TGGAAAGACC

910 920 930 940 950 960
AACTCAGGCT ACATCAAGCT GGGAACCAGG TATGGTGGTA TCGATTGTTA CGGTATCAGT

970 980 990 1000 1010 1020
GCGGATACGG TCCCGGAAAT TGTACGACTT TATACAGGTC TTGTTGGACG TTCAAAGTTG

1030 1040 1050 1060 1070 1080
AAGCCCAGAT ATATTCTCGG GGCCCATCAA GCCTGTTATG GATACCAACA GGAAAGTGAC

1090 1100 1110 1120 1130 1140
TTGTATTCTG TGGTCCAGCA GTACCGTGAC TGTAAATTTC CACTTGACGG GATTCACGTC

1150 1160 1170 1180 1190 1200
GATGTCGATG TTCAGGACGG CTTCAGAACT TTCACCACCA ACCCACACAC TTTCCCTAAC

1210 1220 1230 1240 1250 1260
CCCAAAGAGA TGTTTACTAA CTTGAGGAAT AATGGAATCA AGTGCTCCAC CAATATCACT

1270 1280 1290 1300 1310 1320
CCTGTTATCA GCATTAACAA CAGAGAGGGT GGATACAGTA CCCTCCTTGA GGGAGTTGAC
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1 OOP\ 


1J4U 


1 OCA 


1 OCP 


1 07PI 

1 J/U 


1 TOP 

1380 


A A A A A ATAPT 
AAAAAA I AL i 


TTA TP* A TPP A 
1 t A 1 LA 1 bbA 


PPAP AP ATAT 

LuALAbA 1 A 1 


APPP APPPA A 

AL'^bAbbbAA 


P A APTP^P A A 

LAAb 1 bubAA 


TP PPA AP^ <\T 

IbLbAAbljA 1 


1 OQf\ 


14UU 


i41u 




14JU 


1 A A A 

144U 


pttppptap a 
b 1 1 Lbb 1 ALA 


tptapt \ppp 
1 u 1 AL 1 ALbU 


TppTPPTA AT 

i bb 1 bu 1 AA 1 


A APPTTP A PP 

AAbu 1 1 bAbb 


TPPATPPTA A 

ILbAILL 1 AA 


TP A T^ TT A A T 

1 bA 1 a i i AA 1 


l a en 

140U 


14oU 


14/ u 


t /ion 


I4yu 


ibUU 


rcTcrrrr ap 
bb t LbuLLAb 


Ap 'MA A AT A 


r a « p*^ \ TP AP 
LAAL i A l bAL 


i iLuLLbLbA 


A PTTP A AP A ° 

AL i lLAALAu 


PaaaPaat^p 

LAAALAA I ^L 


1 C 1 A 


i con 
Ib<:U 


i con 
IboU 


lo4U 


1 CCA 

IbbU 


1 Z C A 

IDOU 


ppptatp atp 
LLL I A I LA I b 


b 1 bb 1 c i uAb 


PTAPPPTTAT 

L 1 ALbb II A! 


buuAALbu 1 A 


b I uLAbb I I I 


1 1 AL^uuHL 


1 C7A 


iboU 


i enn 


1 P AO 

loUU 


1 CI A 

IblU 


1 P O A 

1620 


PTPA APAPAA 

L 1 LAALAbAA 


ATP APP7TPP 

AbuAbu I 1 Lu 


1 Al L ibblbb 


PP A A TPP APT 

bLiAA I bLAb 1 


APA APTATPT 

ALAAb I A I L 1 


PTTP ~ ATAT" 

L 1 1 LuATA \\i 


loJU 


l c>i n 
I04(J 


1 CCA 

lbbU 


lboO 


1 C7A 

lo/U 


\ POA 

1680 


PPAPTP^A AT 

bbAL I buAA I 


TTPTPT^^P A 

1 iblblouLA 


APAP AT r APT 

AbALA 1 uAL 1 


Arrrr app a a 

ALLLLAbLAA 


TA A Ap AP ATP 

TulAL ALA I L 


ATA TP A A 1 r 

ATA IGG AuAL 


1 CQPl 

lbyu 


i 7f\n 
1/UU 


1 71 A 

1/10 


1 70A 

1/20 


1 70A 

1/30 


" 1 A A 

i 740 


A TP A A APP^T 

A I bAAAbbu I 


tp rrr Arrrr 
1 bLLLALLLb 


TP TAP TP P TP 

1 L i AL 1 Lb 1 L 


A P rTr Art ATT 

ALCiGAgALI 


P P P TP A P P A A 

CLG i lAClAA 


"*"A P A Tp to a r 

iGCvJuGAG 


i 7C pi 
1/bU 


i 7cn 
L/bU 


1 77A 


1 70A 

1/80 


1 7PlA 

1/90 


1 A A A 

1800 


a AAA APPTrP 

AAAAAbL 1 Lb 


P A A TTP A A A P 

LAA 1 1 bAAAL 


TTPPPPTPTA 

1 IbGGLTClC 


X A ptppt ap a 

TAlTCCTACA 


A T"/^ T A A A r" A A 

ATCTCCACAA 


Afc a a at — rrr 

AGCAACTTGG 


1810 


i Girt 
1820 


1 O O a 

1830 


1 O a A 

1840 


1 A A A 

1850 


1860 


P ATPPTPTTA 

LAIbblLi I A 


b I Lb 1 L 1 LGA 


ATPTP^TA AP 

AIL iL-olAAG 


A APA aappaa 

AACAAACGAA 


APTTA ATPPT 

ACTTCATCCT 


rrr rrr t a o a 

CGGGCGTGuA 


1870 1880 1890 1900 1910 1920
AGTTAiGlCG GAGCCTATCG TTTTGCTGGT CTCTGGACTG GGGATAATGC MGTAACTGa

1930 1940 1950 1960 1970 1980
bAAI1LTGGA AGATA1LGGT CTCTCAAGTT CTTTCTCTGG GCCTCAATGG TGTGTGCATC

1990 2000 2010 2020 2030 2040
GCGGGGTCTG ATACGGuTGG 11iIGAACCC TACCGTGATG CAAATGGGGT CGAGGAGAAA

2050 2060 2070 2080 2090 2100
lALIGlAGCC CAGAGCTACT CATCAGGlGG TATACTGGTT CATTCCTCTT GCCGTGGCTC

2110 2120 2130 2140 2150 2160
AGGAACCATT ATGTCAAAAA GGACAGGAAA TGGTTCCAGG AACCATACTC GTACCCCAAG

2170 2180 2190 2200 2210 2220
CATCTTGAAA CCCATCCAGA ACTCGCAGAC CAAGCATGGC TCTATAAATC CGTTTTGGAG

2230 2240 2250 2260 2270 2280
ATCTGTAGGT ACTATGTGGA GCTTAGATAC TCCCTCATCC AACTACTTTA CGACTGCATG

2290 2300 2310 2320 2330 2340
TTTCAAAACG TaGTCGACGG TATGCCAATC ACCAGATCTA TGCTCTTGAC CGATACTGAG

2350 2360 2370 2380 2390 2400
GATACCACCT TCTTCAACGA GAGCCAAMG TTCCTCGACA ACCAATATAT GGCTGGTGAC

2410 2420 2430 2440 2450 2460
GALAnCTTG TTGCACCCAI CCTCCACAGT CGCAAAGAAA TTCCAGGCGA AAACAGAGAT

2470 2480 2490 2500 2510 2520
blCFATCTCL CTlTTTACCA CACCTGGTAC CCCTCAAATT TGAGACCATG GGACGATCAA

2530 2540 2550 2560 2570 2580
GGAb1CGCTT TGGGGAAiCC TGTCGAAGGT GGTAGTGTCA TCAATTATAC TGCTAGGATT


2590 


OP A A 

2600 


OP1 A 

2610 


OP A A 

2620 


A^ A A 

2630 


2640 


pttcpapppp 

b 1 1 uLALLLb 


APPATTATA A 

AbbA 1 1 A 1 AA 


TPTPTTPPAP 
1 L 1 L 1 1 LLAL 


Abtu 1 uo 1 AL 


PAPTPTAPPT 

LAG 1 L I ALb 1 


TAGAGAGGu 1 


2650 2660 2670 2680 2690 2700
GCCATCATCC CGCAAATCGA AGTACGCCAA TGGACTGGCC AGGGGGGAGC CAACCGCATC

2710 2720 2730 2740 2750 2760
AAGTTCAACA TCTACCCTGG AAAGGATAAG GAGTACTGTA CCTATCTTGA TGATGGTGTT

2770 2780 2790 2800 2810 2820
AGCCGTGATA GTGCGCCGGA AGACCTCCCA CAGTACAAAG AGACCCACGA ACAGTCGAAG

2830 2840 2850 2860 2870 2880
GTTGAAGGCG CGGAAATCGC AAAGCAGATT GGAAAGAAGA CGGGTTACAA CATCTCAGGA

2890 2900 2910 2920 2930 2940
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FIGURE 9 CONTINUED 

ACCGACCCAG AAGCAAAGGG TTATCACCGC AAAGTTGCTG TCACACAAAC GTCAAAAGAC 

2950 2960 2970 2980 2990 3000 

AAGACGCGTA CTGTCACTAT TGAGCCAAAA CACAATGGAT ACGACCCTTC CAAAGAGGTG 

3010 3020 3030 3040 3050 3060 

GGTGATTATT ATACCATCAT TCTTTGGTAC GCACCAGGTT TCGATGGCAG CATCGTCGAT 

3070 3080 3090 3100 3110 3120 

GTGAGCAAGA CGACTGTGAA 7GT7GAGGGT GGGGTGGAGC ACCAAGTTTA TAAGAACTCC 

3130 3140 3150 3160 3170 3180 

GATTTACATA CGGTTGTTAT CGACGTGAAG GAGGTGATCG GTACCACAAA GAGCGTCAAG 

3190 3200 
ATCACATGTA CTGCCGCTTA A 
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FIGURE 10 

a-GLUCAN LYASE CODING SEQUENCE 
SEQUENCE TYPE: NUCLEIC ACID 
MOLECULE TYPE: DNA (GENOMIC) 
ORIGINAL SOURCE: FUNGUS 
SEQUENCE LENGTH: 3213 BP 
STRANDEDNESS: DOUBLE 
SEQUENCE: 



10 20 30 40 50 60
ATGGCAoGAT TATCCGACCC TCTCAATTTC TGCAAAGCAG AGGACTACTA CueiGCTGCC

70 80 90 100 110 120
AAAGGCTGGA GTGGCCCTCA GAAGATCATT CGCTATGACC AGACCCCTCC TCAGGGTACA

130 140 150 160 170 180
AAAGATCCGA AAAGCTGGCA TGC3GTAAAC CTTCCTTTCG ATGACGGGAC TATGTGTGTA

190 200 210 220 230 240
GTGCAATTCG TCAGACCCTG TGTTTGGAGG GTTAGATATG ACCCCAGTGT CAAGACTTCT

250 260 270 280 290 300
GATGAGTACG GCGATGAGAA TACGAGGACT ATTGTACAAG ACTACATGAC TACTCTGGTT

310 320 330 340 350 360
GGAAACTTGG ACATTTTCAG AGGTCTTACG TGGGTTTCTA CGTTGGAGGA TTCGGGCGAG

370 380 390 400 410 420
TACTACACCT TCAAGTCCGA AGTCACTGCC GTGGACGAAA CCGAACGGAC TCGAAACAAG

430 440 450 460 470 480
GTCGGCGACG GCCTCAAGAT TTACCTATGG AAAAATCCCT TTCGCATCCA GGTAGTGCGT

490 500 510 520 530 540
CTCTTGACCC CCCTGGTGGA CCCTTTCCCC ATTCCCAACG TAGCCAATGC CACAGCCCGT

550 560 570 580 590 600
GTGGCCGACA AGGTTGTTTG GCAGACGTCC CCGAAGACGT TCAGGAAAAA CTTGCATCCG

610 620 630 640 650 660
CAGCATAAGA TGTTGAAGGA TACAGTTCTT GATATTATCA AGCCGGGGCA CGGAGAGTAT

670 680 690 700 710 720
GTGGGTTGGG GAGAGATGGG AGGCATCGAG TTTATGAAGG AGCCAACATT CATGAATTAT

730 740 750 760 770 780
TTCAACTTTG ACAATATGCA ATATCAGCAG GTCTATGCAC AAGGCGCTCT TGATAGTCGT

790 800 810 820 830 840
GAGCCGTTGT ATCACTCTGA TCCCTTCTAT CTCGACGTGA ACTCCAACCC AGAGCACAAG

850 860 870 880 890 900
AACATTACGG CAACCTTTAT CGATAACTAC TCTCAGATTG CCATCGACTT TGGGAAGACC

910 920 930 940 950 960
AACTCAGGCT ACATCAAGCT GGGTACCAGG TATGGCGGTA TCGATTGTTA CGGTATCAGC

970 980 990 1000 1010 1020
GCGGATACGG TCCCGGAGAT TGTGCGACTT TATACTGGAC TTGTTGGGCG TTCGAAGTTG

1030 1040 1050 1060 1070 1080
AAGCCCAGGT ATATTCTCGG AGCCCACCAA GCTTGTTATG GATACCAGCA GGAAAGTGAC

1090 1100 1110 1120 1130 1140
TTGCATGCTG TTGTTCAGCA GTACCGTGAC ACCAAGTTTC CGCTTGATGG GTTGCATGTC

1150 1160 1170 1180 1190 1200
GATGTCGACT TTCAGGACAA TTTCAGAACG TTTACCACTA ACCCGATTAC GTTCCCTAAT

1210 1220 1230 1240 1250 1260
CCCAAAGAAA TGTTTACCAA TCTAAGGAAC AATGGAATCA AGTGTTCCAC CAACATCACC

1270 1280 1290 1300 1310 1320
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FIGURE 10 CONTINUED 



CubTTATCA 


r t a tp a r a r a 

GTATCAGAGA 


Trr rr rr a at 

TlGCC^GAAT 


1 "1 A A 


1340 


1350 


A A A A A A^ A AT* 

AAAAAG i ACT 


tr a rr a ta a a 

TCATC A i GGA 


Tr a a a r a t a t 

TGACAGATAT 


.390 


i /inn 

1400 


1410 


nrrr a 4 t * at* 

GTTCG* i ACT 


C i i 1 1 I ACGu 


r^rTrrr a a a 

CdjTGGuAAC 


* a *■ a 


* 1 r n 


1470 




a r^Trr ^ a a * 

ACTTTGuAGA 


r * a r r a T*r A /> 

CAATTATGAC 


-o:0 


* r aa 

i520 


1 "" A A 

1d30 


r r r t a —■" \ -rr* 

CCCTA.vATu 


GTGGTGTGAG 


tt a r^r A T \ T 

TTACuuATAT 


* — T A 


1580 


1590 


CTTAACaoAG 


AaGAGGTTCG 


i ATCTbuTGG 


1630 1640 1650
GGACTAGAGT TTGTATGGCA AGATATGACA

1690 1700 1710
ATGAAAGGGT TGCCCACCCu TCTGCTCGTC

1750 1760 1770
AAAAAGCTCu CAATTGAAAG TTGGGCTCTT

1810 1820 1830
CACGGTCTTG GTCGTCTTGA GTCTCGTAAG

1870 1880 1890
AGTTACGCCG GTGCCTATCG 111IGCTGGT

1930 1940 1950
GAATTCTGGA AGATTTCGGT CTCCCAAGTT

1990 2000 2010
GCGGGGTCTG ATACGGGTGG TTTTGAGCCC

2050 2060 2070
TGCAGTCCGG AGCTACTCAT CAGGTGGTAT

2110 2120 2130
AACCACTACG TCAAGAAGGA CAGGAAATGG

2170 2180 2190
CTTGAAACCC ATCCAGAGCT CGCAGATCAA

2230 2240 2250
TGCAGATACT GGGTAGAGCT AAGATATTCC

2290 2300 2310
CAAAACGTGG TCGATGGTAT GCCACTTGCC

2350 2360 2370
ACGACCTTCT TCAATGAGAG CCAAAAGTTC

2410 2420 2430
ATCCTTGTAG CACCCATCCT CCACAGCCGT

2470 2480 2490
TATCTCCCTC TATTCCACAC CTGGTACCCC

2530 2540 2550
GTCGCTTTAG GGMTCCTGT CGAAGGTGGC

2590 2600 2610
GCCCCAGAGG ATTATAATCT CTTCCACAAC

2650 2660 2670
ATCATTCCGC AAATTCAGGT ACGCCAGTGG

2710 2720 2730
TTCAATATCT ACCCTGGAAA GGACAAGGAG

2770 2780 2790
CGCGATAGTG CACCAGATGA CCTCCCGCAG

2830 2840 2850
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A A AT A A A r T A 

GGGTACAGTA 


rAAT^A n TA A 

CCCTuAATGA 


A^r A T A Tr A 

GouATATGA i 


1 AAA 

1360 


1 A "T A 

1370 


1380 


a r r A a r r ^ r a 

ACCGAGudGA 


A A A rt^i" A ^ A 

CAAGTubGuA 


r^r«^r\ a * a 

^^v^ulAAAA i 


1 /» AA 

1420 


T AAA 

1430 


1440 


rrrrTT^ a r^ 

CCGGTTuAGu 


r T AAAAATAA 

TTAACCCTAA 


TA A t^ " ' ' r ° 

TGAhj ; , i Go 


1 * A A 

1480 


1 A A A 

1490 


ioOO 


ri'r A AT A A/"* A. 

TTCCCTACGA 


\ r T~**^ a a A"^"^ 

ACTTtAAC :u 


CAAAGAC i A-^ 


1540 


1550 


IdoO 


A ^ ^ A A T A r r A 

bouAATGGCA 


A Tr A AAA "7 — r ^ 

CTCCAGGTTA 


rTA r^^^r a ^ 

CTACs.^ .uAl 


1600 


1 A T A 

1610 




<r+ r* A I T' r r A /"» "T" 

uuATTGCAG i 


a rrArTATrT 

ACGAGTATCT 


t ■ r a a t a Tr 

CTTCAATATG 


1660 


1670 


1680 


ACCCCAGCGA 


i CCATTCATC 


A A ^*/^ a A » A 

ATATGGAGAC 


1720 1730 1740
ACCGCCGACT CAGTTACCAA TGuv.TCTGAG

1780 1790 1800
TACTCCTACA ACCTCCATAA AGCAACCTTC

1840 1850 1860
AACAAACGTA ACTTCATCCT CGGACGTGGT

1900 1910 1920
CTCTGGACTG GAGATAACGC AAGTACGTGG

1960 1970 1980
CTTTCTCTAG GTCTCAATGG TGTGTGTATA

2020 2030 2040
GCACGTACTG AGATTGGGGA GGAGAAATAT

2080 2090 2100
ACTGGATCAT TCCTTTTGCC ATGGCTTAGA

2140 2150 2160
TTCCAGGAAC CATACGCGTA CCCCAAGCAT

2200 2210 2220
GCATGGCTTT ACAAATCTGT TCTAGAAATT

2260 2270 2280
CTCATCCAGC TCCTTTACGA CTGCATGTTC

2320 2330 2340
AGATCTATGC TCTTGACCGA TACTGAGGAT

2380 2390 2400
CTCGATAACC AATATATGGC TGGTGACGAC

2440 2450 2460
AACGAGGTTC CGGGAGAGAA CAGAGATGTC

2500 2510 2520
TCAAACTTGA GACCGTGGGA CGATCAGGGA

2560 2570 2580
AGCGTTATCA ACTACACTGC CAGGATTGTT

2620 2630 2640
GTGGTGCCGG TCTACATCAG AGAGGGTGCC

2680 2690 2700
ATTGGCGAAG GAGGGCCTAA TCCCATCAAG

2740 2750 2760
TATGTGACGT ACCTTGATGA TGGTGTTAGC

2800 2810 2820
TACCGCGAGG CCTATGAGCA AGCGAAGGTC

2860 2870 2880
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FIGURE 10 CONTINUED 

GAAGGCAAAG ACGTCCAGAA GCAACTTGCG GTCATTCAAG GGAATAAGAC TAATGACTTC 

2890 2900 2910 2920 2930 2940 

TCCGCCTCCG GGATTGATAA GGAGGCAAAG GGT7ATCACC GCAAAGTTTC TATCAAACAG 

2950 2960 2970 2980 2990 3000 

GAGTCAAAAG ACAAGACCCG TACTGTCACC ATTGAGCCAA AACACAACGG ATACGACCCC 

3010 3020 3030 3040 3050 3060 

TCTAAGGAAG TTGGTAATTA TTATACCATC ATTCTTTGGT ACGCACCGGG CTTTGACGGC 

3070 3080 3090 3100 3110 3120 

AGCATC3TCG ATGTGAGCCA GGCGACCGTG AACATCGAGG GCGGGGTGGA ATGCGAAATT 

3130 3140 3150 3160 3170 3180 

TTCAAGAACA CCGGCTTGCA XACGGTTGTA GTCAACGTGA AAGAGGTGAT CGGTACCACA 

3190 3200 3210 

AAGTCCGTCA AGATCACTTG CACTACCGCT TAG 



WO 96/29416 



PCT/EP96/01009 



27/38 



FIG. 11 




Xbai 4.281 
SicCI 4.638 
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FIG. 13 
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FIG. 14 




Xss-Amy 
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